The problem of amphibian developmental pattern has been attacked by
many investigators in widely different ways, but the question of the essen- 
tial characteristics of this pattern and of vertebrate developmental pattern 
in general is still under discussion. The following briefly presented data, 
giving evidence of patterns of enzyme activity, are concerned with this 
question. 

Methods.—Intracellular reduction and reoxidation of methylene blue 
and Janus green and formation, reduction and reoxidation of indophenol 
in living intact developmental stages of a urodele, Triturus, and a teleost, 
Oryzias latipes, show the presence of definite oxidation-reduction patterns, 
characteristic for particular developmental stages, and undergoing definite 
changes in the course of development.* Lightly pigmented and occasion-
ally occurring unpigmented egg masses of Triturus permit direct observa-
tion of these patterns. Triturus embryos remaining within the vitelline
membrane after removal of the jelly stain very slowly with oxidized
methylene blue, but with decrease of free oxygen in the solution by addition 
of a minute amount of sodium hydrosulphite the dye, reduced to the color-
less “leucobase,” penetrates both vitelline membrane and embryo almost
at once, and with increase of oxygen recoloration occurs. 

The teleost embryos within the chorion stain with oxidized methylene 
blue, but the chorion stains so rapidly and deeply that it may become 
difficult to see the embryonic patterns. Oxidized Janus green also stains
through the chorion but when reduced to the colorless form by hydro- 
sulphite it penetrates more rapidly and reoxidizes to red. 

Intracellular dye reduction is brought about in both amphibian and 
teleost by oxygen decrease, either rapidly by hydrosulphite, or more slowly 
by oxygen uptake of embryos sealed in small volume of dye solution. On 
increase of oxygen after reduction reoxidation of intracellular dye with 
recoloration occurs in a definite differential pattern. In the extremely
small amounts used, sodium hydrosulphite is not appreciably toxic except 
perhaps after often repeated additions to the dye solution. Reduction
and reoxidation of methylene blue can be repeated several times without 
affecting further development. 

Intracellular formation of indophenol, deep blue in oxidized form, from 
dimethylparaphenylenediamine (para-aminodimethylaniline) and a-naph- 
thol, catalyzed by an oxidase, is finally toxic or lethal, but with low con- 
centrations of reagents the intracellular reaction pattern becomes clearly 
visible before toxic effects appear. The reaction is of little value for intact 
earlier embryos of Triturus because it is extremely slow or inappreciable,
but it occurs readily in cells adjoining cut or torn surfaces of isolated 
pieces in Holtfreter solution or locally injured regions of these stages, and 
after closure of the neural tube and in young animals after hatching it takes 
place relatively rapidly. Teleost embryos within the chorion show the 
reaction at all stages, but more rapidly later than earlier, and much more 
rapidly in regions or adjoining surfaces where relations of cells have been 
disturbed by cutting, tearing or isolation of pieces, than in intact parts. 
After intracellular formation indophenol is reduced to the colorless form 
and reoxidized with recoloration in the same ways as methylene blue.†

In living amphibian and teleost material in good condition the gradient 
patterns shown by the indophenol reaction and by reduction of indophenol, 
Janus green and methylene blue coincide in direction and recoloration on 
reoxidation progresses in the reverse direction. 

Pattern in Triturus Development.—In early cleavage reduction of methyl- 
ene blue and Janus green progresses from the apical (animal) region, more 
rapidly on one side, presumably the dorsal side. In blastula stages the 
gradient from the apical region is present and reduction also spreads from 
an area on the presumably dorsal side. As soon as the earliest stages of 
invagination permit identification of dorsal and ventral sides with cer- 
tainty the dorsal lip of the early blastopore appears as the most rapidly 
reducing region of the embryo and reduction progresses anteriorly and 
somewhat laterally from it. This dorsal area of rapid reduction is clearly 
distinguishable to the naked eye and reappears in successive reductions 
until overstaining or hydrosulphite produces toxic effects. It evidently 
coincides approximately with the dorsal inductor region but is without 
distinct boundary, the decrease in reduction toward its anterior and lateral 
border being gradual. In these early gastrula stages the reduction gradient 
from the apical region is slight but still visible when pigmentation is not 
too deep. Dorsally it meets the gradient from the dorsal lip above the 
equator. The reduction pattern in the dorsal inductor region persists dur- 
ing gastrulation but in later gastrula stages usually appears somewhat less 
extensive. As the blastopore progresses from crescentic to circular outline, 
that is, as invagination extends laterally and ventrally, reduction usually
appears somewhat more rapid in lateral and ventral lips, but the gradient 
does not extend far laterally and ventrally from the lip, and reduction is 
never as rapid as in the dorsal lip. After closure of the blastopore a small 
area, chiefly anterior to the blastopore region, still shows relatively rapid 
reduction for a time. 

With beginning of neurulation a further change in pattern appears. 
Even before the neural plate is distinctly visible reduction becomes more 
rapid in its anterior region than elsewhere and progresses posteriorly from 
it. As the neural folds appear they become regions of rapid reduction, also 
progressing posteriorly. Reduction outside the neural plate is much less 
rapid and progresses posteriorly and ventrally. In later neurula stages and 
after closure of the neural tube reduction progresses from the anterior head 
region posteriorly, rapidly in the dorsal region and posteroventrally and 
much less rapidly in lateral regions. Early stages of the tail bud are 
associated with a new region of more rapid reduction and with outgrowth 
of the bud this becomes a reduction gradient with high end at the tip. 

Reduction of methylene blue, Janus green and indophenol and the indo- 
phenol reaction become more rapid in early bud stages of the fore leg than 
in surrounding lateral regions. With outgrowth of the leg the almost 
radial gradient pattern of the bud becomes longitudinal in consequence of 
differential growth, with rates of reaction and reduction decreasing from 
the tip. In the outgrowing leg rates of reaction and reduction also de- 
crease from the anterodorsal to the posteroventral side, that is, the de- 
veloping leg apparently retains the anterodorsal-posteroventral gradient of 
the body. Since a longitudinal gradient pattern is also present in it, it 
possesses pattern in three dimensions long before morphological differ- 
entiation is evident and probably from the beginning of its outgrowth. 

In outgrowing gill filaments and balancer rates of reduction and of indo- 
phenol reaction decrease from the tip. In the gill filament and filament 
complex of later stages reduction is more rapid on the arterial than on the 
venous side, probably because of lower oxygen content of blood coming to 
the gill. 

In general, regions which reduce more rapidly show recoloration on 
reoxidation less rapidly than others, that is, reoxidation gradients are the 
reverse of reduction gradients, but all gradient patterns may be partially 
or wholly reversed or obliterated by differential toxic effects of overstain- 
ing, indophenol, long continued low oxygen or hydrosulphite. 

Pattern in Oryzias Development.—With this form intracellular formation, 
reduction and reoxidation of indophenol were used to a greater extent than 
reduction and reoxidation of methylene blue and Janus green, but all give
the same results. In early cleavage and blastoderm stages the central 
region of the blastoderm, where cell division is more rapid, seems to show 

342 ZOOLOGY: C. M. CHILD Proc. N. A. S. 


indophenol reaction and reduction somewhat more rapidly than the 
margins, but the difference is at best slight. Before invagination begins 
one side of the blastoderm reacts and reduces more rapidly than other 
parts. At the earliest stages of invagination it becomes evident that this is 
the side on which invagination begins, and the region of rapid reduction and 
reaction corresponds to the region of rapid reduction indicating the dorsal 
inductor in Triturus. In slightly later stages of invagination this dorsal 
inductor region becomes still more clearly distinguishable with rate of 
reduction and reaction decreasing anteriorly from the lip of the blastopore 
and laterally from the prospective median region of the embryo. With 
development of the germ ring around the yolk, its lateral and ventral 
regions show slightly more rapid reduction and reaction than regions 
anterior to them. 

As the embryonic area or “shield” forms, reduction and reaction become
increasingly rapid in its anterior region where the anterior end of the 
embryo will develop. At this stage there are two opposed longitudinal 
gradients in the embryo, one from the dorsal lip, the other from the anterior 
embryonic region. Rates of reduction and reaction also decrease from the 
median region laterally. With progress of germ ring over the yolk and 
closure of the blastopore the posterior gradient becomes less distinct and
disappears, and reduction and reaction progress from the anterior end 
posteriorly, with a short gradient in the opposite direction as the tail 
develops. This anteroposterior gradient, together with progress of reduc- 
tion and reaction from the dorsal region ventrally in head and body, persists 
in later embryonic stages. The dorsiventral gradient shows the greatest 
differences in the head region and appears even in the developing eyes. 
There reduction and indophenol reaction decrease in rate ventrally and 
slightly posteriorly from the dorsal and slightly anterior region and pigment 
development follows the same course. Evidently the eye, as a lateral out- 
growth, like the leg, retains the general gradient pattern of the body. If 
this is the case in the urodele eye, it is probably a factor in determining lens 
regeneration from the dorsal margin of the iris. 

In Oryzias, as in Triturus, cells which have been disturbed in their rela-
tions by cutting or tearing the embryo during or after removal from the
chorion in a Ringer solution, even in stages before gastrulation, and iso- 
lated cell groups and cells show more rapid indophenol reaction than intact 
embryos or parts, suggesting that oxidase activity may have been in- 
creased by the injury. Differences in rate of reduction are more variable 
because torn or cut regions and isolated cell groups react so much more 
rapidly than intact embryos or parts that they become deeply colored and 
may be damaged while reaction is still very slight in the undisturbed cells. 
With intact embryos and isolated parts in separate solutions there is no 
certainty as regards intracellular concentrations of indophenol. However, 
with more or less similar, apparently non-toxic concentrations, as indicated
by color, a far from trustworthy indicator in cell masses of different thick- 
ness when viewed by transmitted light, isolated cell groups and cells, and 
cells adjoining torn or cut surfaces often reduce more rapidly than intact 
parts and reoxidize more slowly. 

Discussion.—The indophenol reaction and intracellular reduction and 
reoxidation of indophenol, methylene blue and Janus green in living intact 
individuals all show the presence of definite gradient patterns in early 
development of amphibian and teleost. The more general features of these 
patterns in the two animals are essentially similar as regards their relations 
to embryonic axes and prospective regions. The patterns are present long 
before there is any evidence of morphological differentiation and undergo 
definite changes in the course of development, with localizations of regions 
of accelerated reaction and reduction which later induce or develop par- 
ticular organ systems. All these characteristics indicate that these patterns 
are expressions of essential physiological factors of development, and that 
different specific systems of activities and conditions and qualitative dif- 
ferentiations develop from primarily quantitative gradient patterns. The 
early outgrowing amphibian leg undoubtedly differs specifically in some 
way from early gill filament and balancer, and all of these differ from the 
gastrula, but a quantitative gradient pattern appears in all. The local 
differences characterizing leg, gill filament and balancer originate within, 
and in definite relation to, the gastrula pattern, and the local differences 
within leg, gill filament and balancer arise within, and in definite relation 
to, the gradient patterns of early stages of these organ systems. 

Whether regional differences in reduction, reoxidation and indophenol 
reaction do or do not coincide with regional differences in oxygen uptake 
and CO₂ production, and whether the assumption is or is not justified that
respiration of small isolated fragments of embryos remains the same as 
when they were parts of the intact individual, it appears evident that 
respiratory determinations on small isolated fragments do not by any means 
tell the whole story as regards physiological developmental patterns. 
Moreover, the indophenol reaction, and less certainly, reduction and re- 
oxidation, indicate that physiological condition may be considerably 
altered in small isolated pieces of embryos. 


* This study was made possible by the kindness of Dr. V. C. Twitty in making am- 
phibian material available, and of Dr. D. M. Whitaker in permitting use of teleost 
developmental stages. 

† Using low concentrations of reagents, it has been possible to show presence of defi-
nite indophenol oxidation-reduction patterns in other living intact organisms. Such
patterns in the echinoid, Dendraster, have already been described (Proc. Nat. Acad. Sci.,
27, 523 (1941)). Data on ciliate Protozoa, an annelid, Nais paraguayensis, develop-
ment of the starfish, Patiria, the ascidian, Clavellina, and ovaries of Drosophila are
still unpublished.
NUCLEOPROTEINS OF CELL NUCLEI
By A. E. MIRSKY AND A. W. POLLISTER


HOSPITAL OF THE ROCKEFELLER INSTITUTE FOR MEDICAL RESEARCH, AND DEPARTMENT 
OF ZOOLOGY, COLUMBIA UNIVERSITY 


Communicated August 3, 1942 


We have prepared nucleoproteins from a wide variety of animal cells— 
from mammalian liver, kidney, pancreas, spleen, thymus, brain, from the 
liver, spleen and blood cells of the dogfish, and from the sperm of the trout, 
shad, frog and sea urchin. These nucleoproteins are located in the nuclei 
of the cells from which they are derived. In this paper we shall briefly de- 
scribe the method of preparation, some properties of the nucleoproteins 
and the evidence that they are in fact derived from cell nuclei. 

I. Preparation.—To extract these nucleoproteins from the cell and to 
separate them from other cellular constituents nothing more drastic is 
used than neutral sodium chloride solutions of varying concentrations. 
Before extraction, much cytoplasmic material is removed by thoroughly 
washing the minced tissue with physiological saline. From liver more than 
60 per cent of all the protein present can be removed in this manner with- 
out destroying the main outlines of cell structure. The washed tissue is 
then extracted with 1 M NaCl (2 M NaCl is needed for extraction of sea-
urchin sperm). As soon as the more concentrated salt solution is added
the mixture becomes exceedingly viscous. By centrifugation at high speed 
(10,000 to 12,000 r. p. m.) a viscous, slightly opalescent supernatant fluid 
is obtained. The supernatant fluid is viscous because of the nucleoprotein 
dissolved in it. When this solution is added to six volumes of water the 
nucleoprotein precipitates in a fibrous mass, settling rapidly so that the
supernatant fluid can be syphoned off.' The precipitate is washed with 
0.14 M NaCl and then redissolved in 1 M NaCl. The solution is centri- 
fuged at high speed to remove any suspended material. The nucleopro- 
tein is reprecipitated by pouring into six volumes of water. If the mixture 
is stirred with a rod having a crook at its end, the fibrous material generally 
winds around the rod and adheres to it when the rod is transferred to 
another vessel (Plate I). The nucleoprotein is again dissolved, centrifuged 
and precipitated. At this point the preparation is frequently considered 
to be finished, although purification can be carried further. 

For further purification advantage is taken of the unusual solubility of 
the nucleoprotein. It is soluble in 1 M NaCl, insoluble in 0.14 M NaCl
and soluble again when the salt concentration is reduced to approximately 
0.02 M. When a solution of nucleoprotein in 1 M NaCl is placed in a 
cellophane tube and dialyzed against water the nucleoprotein first precip- 
itates and then tends to redissolve as the salt concentration within the 
cellophane tube continues to drop. The insoluble material is removed by
centrifuging. The soluble fraction is precipitated by adding enough NaCl 
to bring the concentration to 0.14 M. After centrifugation the precipitate 
is dissolved in 1 M NaCl. Dialysis can be repeated several times, after 
which material insoluble in 0.02 M NaCl is no longer present. 

The quantity of nucleoprotein that dis- 
solves in 0.02 M NaCl varies considerably, 
depending on the source from which it is 
prepared. Practically all of the nucleo- 
protein prepared from sheep spleen redis- 
solves when the solution in 1 M NaCl is 
dialyzed. None of the nucleoprotein of
trout sperm remains in solution after dialy- 
sis against water. Between these two ex- 
tremes fall the nucleoproteins prepared 
from other sources. 

Once the nucleoprotein has been dis- 
solved in water (or in 0.02 M NaCl) there 
appear changes in its properties; it becomes 
less viscous, less birefringent when stirred 
and much less fibrous when precipitated 
in 0.14 M NaCl. These changes persist 
even after the nucleoprotein is dissolved in 
1 M NaCl. It may at first appear as if 
these modifications were the result of a 
fractionation in which less viscous, bire- 
fringent and fibrous material is separated 
from the bulk of nucleoprotein, but this
interpretation appears unlikely because in 
the case of nucleoprotein prepared from 
sheep spleen virtually all of the nucleopro- 
tein extracted in 1 M NaCl remains in solu- 
tion when the salt is dialyzed away. In 
this case, at least, the changes observed are
not due to fractionation.

PLATE I

The solubility of the nucleoproteins is probably correlated with the 
salt concentration of the body fluids. The nucleoprotein from mammals 
and fresh-water fishes precipitates from solution in 1.0 M NaCl when diluted 
to 0.14 M (one part of nucleoprotein solution to six parts of water), a final
concentration isotonic with the blood of these animals. In elasmobranch
fishes, however, in which the salt concentration of the blood is equivalent 
osmotically to a 0.50 M sodium chloride solution, the fibres appear when the 
1.0 M solution of nucleoprotein is added to an equal volume of water. 
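
The dilution figures quoted in this paragraph follow from simple volume arithmetic. The short Python sketch below is an illustration only: it assumes that volumes are additive on mixing (approximately true for salt solutions of this strength), and the helper name `diluted_molarity` is hypothetical, not part of the original method.

```python
def diluted_molarity(stock_molarity, stock_volumes, water_volumes):
    """Final NaCl molarity after mixing a stock solution with pure water.

    Assumption: volumes are additive, which is only approximately
    true for concentrated salt solutions.
    """
    total_volumes = stock_volumes + water_volumes
    return stock_molarity * stock_volumes / total_volumes


# One part of 1.0 M solution to six parts of water (mammals, fresh-water fishes).
mammal = diluted_molarity(1.0, 1, 6)

# One part of 1.0 M solution to an equal volume of water (elasmobranch fishes).
elasmobranch = diluted_molarity(1.0, 1, 1)

print(f"{mammal:.3f} M")        # 0.143 M, i.e. the 0.14 M quoted in the text
print(f"{elasmobranch:.3f} M")  # 0.500 M
```

On these assumptions the two dilutions give about 0.14 M and 0.50 M, in agreement with the concentrations stated for the respective body fluids.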
II. Properties.—The most striking characteristics of these nucleopro-
teins are their viscosity and birefringence of flow when in solution and
their fibrous nature when precipitated. These properties indicate that the 
nucleoprotein molecule is markedly elongated. 

The phosphorus content of the nucleoprotein of Arbacia sperm is 3.05 
per cent, of trout sperm 6.55 per cent and of the nucleoproteins prepared 
from mammalian organs between 3.7 and 4.4 per cent. All of this phos- 
phorus is in the form of desoxyribose nucleic acid. The nucleic acid con- 
tent of the nucleoproteins ranges from 31 to 66 per cent. From each 
nucleoprotein the nucleic acid has been isolated and shown to have a com- 
position closely approximating that expected on the tetranucleotide theory. 
These nucleic acids resemble the highly polymerized nucleic acid prepared 
from the thymus gland in showing high viscosity and birefringence of flow 
when in solution and in forming long fibres when precipitated. 
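
The phosphorus and nucleic acid figures in this paragraph can be checked against each other. The sketch below is illustrative only: it assumes that desoxyribose nucleic acid contains about 9.9 per cent phosphorus (roughly the value for a free-acid tetranucleotide). That factor is an assumption introduced here and is not stated in the text; the function name is likewise hypothetical.

```python
# Assumed phosphorus fraction of desoxyribose nucleic acid (not from the text).
P_FRACTION_OF_NUCLEIC_ACID = 0.099


def nucleic_acid_percent(phosphorus_percent):
    """Estimate nucleic acid content (per cent) from phosphorus content,
    assuming all phosphorus is nucleic acid phosphorus, as the text states."""
    return phosphorus_percent / P_FRACTION_OF_NUCLEIC_ACID


# Phosphorus contents quoted in the text for the two extreme cases.
for source, p in [("Arbacia sperm", 3.05), ("trout sperm", 6.55)]:
    print(source, round(nucleic_acid_percent(p)))
```

With the assumed factor, the extremes of 3.05 and 6.55 per cent phosphorus correspond to roughly 31 and 66 per cent nucleic acid, the range given above.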

The protein component of the nucleoprotein complex has in each instance 
been prepared. These proteins are histones and protamines. They have a
high nitrogen content and basic properties. It is noteworthy that no 
tryptophane has been found in any of them. Investigation of the SH 
groups of the histones shows that they are denatured by the method of 
preparation that has been used in the past. 

The ultra-violet absorption spectra of the nucleoproteins show intense 
absorption at 2540 Å due to the high concentration of nucleic acid present.
The protein part of the complex shows in each case (excepting only the 
protein of trout sperm nucleoprotein, which possesses no aromatic amino 
acids) a maximum at 2750 Å, slightly removed from the maximum at 2800
Å of a “typical protein” such as egg albumin.

The nucleoproteins prepared from mammalian organs spread at an 
air-water interface at pH 4, if a little heptyl alcohol is added to the solution. 
The thickness of the film so formed when measured by the method of Lang-
muir and Blodgett is 15-16 Å. The protein component alone spreads
readily, without the use of heptyl alcohol, at pH 8.3 to give a film 7-9 Å
thick, the thickness characteristic of other proteins. The nucleic acid 
component by itself does not spread at an air-water interface. The spread- 
ing experiments on nucleoprotein and on its separated components show 
that under certain conditions protein and nucleic acid combine to form a 
complex. Preliminary experiments show that in the Tiselius electrophore- 
sis apparatus, liver nucleoprotein migrates as one electrically homogeneous 
complex, indicating another condition under which protein and nucleic 
acid are combined. 

There is evidence that the bond between protein and nucleic acid is loose 
in these nucleoproteins. If a solution (in 1 M NaCl) of the nucleoprotein 
prepared from trout sperm is placed in a cellophane tube and dialyzed 
against 1 M NaCl, the protein (a protamine) gradually diffuses through the 
membrane, leaving the nucleic acid behind, so that after prolonged dialysis
practically pure nucleic acid remains inside the cellophane tube. Experi- 
ments of a different kind with the other nucleoproteins indicate that in 
them, too, the bond between protein and nucleic acid is loose. 

The quantity of nucleoprotein extracted varies considerably. From 
trout sperm practically all the nucleoprotein present in the cell is extracted 
but only 20 to 25 per cent of the desoxyribose nucleoprotein present in 
mammalian organs is extracted. 

III. Site of Origin of Nucleoproteins.—The preparation of a fibrous 
protein by methods essentially the same as the one we have used has an 
exceedingly long history. On numerous occasions since 1865 fibrous pro- 
teins have been extracted from mammalian cells and tissues by concentrated 
solutions of neutral salts and on each occasion the material has been com- 
pared with fibrinogen or myosin. The last papers on the subject are by 
Bensley'* and by Banga and Szent-Györgyi? who, like their predecessors,
compare these fibrous proteins with myosin, and believe that they form the 
structural framework in the cytoplasm of the particular cells from which 
the protein happened to be prepared. Our investigation of the chemical 
nature of the fibrous material shows it to be a desoxyribose nucleoprotein 
and consequently entirely different from myosin. The nucleoproteins we 
have prepared from various mammalian tissues are essentially the same as 
the nucleohistone extracted from the thymus gland by somewhat different 
methods many years ago by Huiskamp* and Bang.‘ It is with thymus 
nucleohistone rather than with myosin that the fibrous materials we have 
prepared should be compared. Hitherto desoxyribose nucleic acids (those 
giving the Feulgen reaction) have not been detected by histochemical 
methods in the cytoplasm; and they have always been found in the cell 
nucleus. The nucleoproteins of the cytoplasm are widely held to contain 
only the ribose type of nucleic acid. None of our preparations has been 
found to contain ribose nucleic acid—not even those extracted from the 
pancreas, in which direct analyses by previous workers have shown the 
gland to contain several times as much ribose nucleic acid as desoxyribose.° 
Hence there is a strong presumption that the nucleoproteins that can be
extracted by strong saline solutions are located within the cell nuclei. 
Furthermore, in the lack of tryptophane the protein component differs 
from most cytoplasmic proteins and resembles the basic protamines of fish 
sperm and the histones of thymus nucleoprotein. 

Both spermatozoa and the lymphocytoid cells of the thymus are types 
in which the nucleus makes up the bulk of the cell; and the amount of des- 
oxyribose nucleoprotein that can be extracted from these tissues is so large 
that most of it must have come from the nucleus. For example, the nu- 
cleus is approximately nine-tenths of the cell volume in some fish sperma- 
tozoa; and over 90 per cent of the dry weight of a suspension of spermatozoa 
is extractable as desoxyribose nucleic acid and protamine. But the source
of the nucleoprotein cannot be thus determined by calculation alone in 
extractions from the mammalian liver, in which we have found the nucleus 
to be somewhat less than one-tenth of the cell volume. For this reason, 
and especially, moreover, because previous workers, without exception, 
have believed that the fibrous proteins which they extracted with strong 
saline solutions came from the cytoplasm of the cells, we have been led to 
make a direct cytological study of the effects of extraction of the cells. 
Observations of individual spermatozoa (of the sea-urchin, Strongylocen- 
trotus purpuratus, and the keyhole limpet, Megathura) when immersed in 
the extracting fluid (2.0 M NaCl) show that only the nucleus is altered. 
At first the elongate nucleus swells and becomes spherical; as the swelling 
continues the outline of the nucleus becomes less definite; and finally the 
nucleus becomes lost to view. The acrosome, the middle piece with its 
mitochondrial body and the tail (that is, the entire cytoplasm of the sper- 
matozoon) are not visibly changed; and after the disappearance of the nuclei 
one finds detached acrosomes and complete tails, with attached middle 


EXPLANATION OF PLATE II 


All figures are photomicrographs of five micra vertical paraffin sections of 100 
micra slices (cut on a freezing microtome) of liver of guinea-pig; fixed in Zenker’s 
fluid; and stained by identical schedules with Delafield’s hematoxylin and eosin. 
Each of the larger figures (4, 5, 6, 10, 11, 12) is a highly magnified (1360 X) photo-
graph of individual cells from the less magnified (200 X) figure immediately above
it. 

Figures 1 and 4. Normal cells, fixed immediately after sectioning on the freez- 
ing microtome; no extraction. 

Figures 2 and 5. Frozen section treated with 0.14 M NaCl for three hours be- 
fore being fixed. This has extracted a large part of the stainable material from 
the cytoplasm. (Negatives and prints of figures 1 and 2, and of 4 and 5 were 
given identical exposures and were developed simultaneously.) 

Figures 3 and 6. Treated with 1.0 M NaCl for five minutes before fixation. 
Nuclei are swollen, no longer stainable, appearing as clear spaces nearly lacking in 
chromatin. 

Figures 7 and 10. Extracted with 1.0 M NaCl for five minutes, then placed in 
0.14 M NaCl (isotonic) for one hour. Compare with figures 3 and 6. The nuclei
have shrunk to approximately their original size, and have regained their staining 
capacity though not the original chromatin pattern. 

Figures 8 and 11. Extracted with 1.0 M NaCl for one hour. Nuclei swollen 
and irregular in shape. In many cases nucleus appears to be flowing out of the 
cell, like a fluid droplet, see lower edge of figure 11. 

Figures 9 and 12. Extracted with 1.0 M NaCl for one hour, then treated for one
hour with 0.14 M NaCl. Compare with figures 8 and 11. The isotonic sodium 
chloride has precipitated the nuclear material (nucleohistone) from its solution 
in strong saline, in the form of long chromophilic fibres between the cells and along 
the margin of the section. In many instances these fibres are directly continuous 
with a mass of similar stainable material in the space originally occupied by the 
nucleus; for example, see the cell on the left of figure 12. 
pieces, floating in the salt solution. These cytological observations thus
show that strong saline treatment (the method by which one extracts nucleo- 
protein) causes the nucleus to swell and go into solution. Since no other 
part of the cell is affected it appears certain that all of the nucleoprotein 
that can be extracted from sperm suspensions comes from the cell nuclei. 
We have studied very thoroughly the effect of extraction of nucleopro- 
tein on the liver cells of the guinea-pig. A few confirmatory observations 
have been made on other mammalian organs such as pancreas, testis and 
kidney. For these studies sections of fresh liver, 100 microns thick, were 
cut on a freezing microtome, and placed in a small volume of extracting 
fluid for a time. They were then removed and fixed in Zenker’s fluid; 
and thin vertical sections of the 100 microns slice were cut in paraffin. 
From each experiment one slide was stained progressively with Delafield’s 
hematoxylin and eosin; and a second slide was stained by the Feulgen 
method. The hematoxylin and the nucleal reaction stained the same ma- 
terial, the chromatin, in the comparable slides. Prolonged treatment of 
the slice of fresh tissue with 0.14 M NaCl (used in the preliminary wash of 
our method of extraction) removes much of the stainable material from the 
cytoplasm, but it does not affect the nucleus (compare Plate II, Figs. 2 and 
5 with Figs. 1 and 4). Very brief treatment with 1.0 M NaCl (the nucleo- 
protein-extracting fluid), in striking contrast, causes a conspicuous change 
in the nuclei. They swell slightly, and, surprisingly enough, most of the 
staining capacity disappears (Plate II, Figs. 3 and 6). This is obviously 
a swelling preliminary to solution of the nucleoprotein—like the process we 
have also observed in the spermatozoa (see above). If a slice of tissue in 
which the nuclei have first been swollen by brief treatment with strong 
saline is transferred to 0.14 M NaCl (the nucleoprotein precipitant) the 
swollen or dissolved nucleoprotein, as would be expected, is precipitated in 
the form of irregular fibrous masses that are stainable with hematoxylin
and the Feulgen stain (Plate II, Figs. 7 and 10). Although with the precip-
itation of these fibrous masses the nuclei have reappeared (from the stand-
point of stainability) it is noteworthy that none of the recovered nuclei
show the uniform, granular pattern of chromatin that characterized them 
before the treatment (compare Plate II, Figs. 7 and 10 with Figs. 1 and 4). 
The strong saline causes much more than a mere swelling of the chromatin 
threads. Quite probably it destroys the intricate arrangement that con- 
stitutes what we know as a thread of chromatin, or a chromosome. Long 
treatment with the strong saline extractant causes the nuclei to become 
greatly swollen, non-stainable, irregular masses, which often extend
outside the cell (Plate II, Figs. 8 and 11). These nuclei quite evidently behave
as viscous droplets—with much the same physical properties as those of the 
freshly extracted solutions of nucleoprotein. The flowing of the nuclei 
outside the cells (and, indeed, frequently entirely outside the section of
tissue) is visible evidence of what must occur during the extraction of a
mass of minced tissue. From these "dissolved" nuclei, in and adjacent to
the sections of tissue, the nucleoprotein can be precipitated in the form of 
long fibres, which are stainable with the nucleal reaction and with hema- 
toxylin (Plate II, Figs. 9 and 12). In these staining capacities the fibres 
agree with the chromatin of the original nucleus from which they were de- 
rived; and it is equally important to note that we have found they share 
these properties also with the fibres that may be precipitated from solutions 
prepared by soaking masses of minced tissue in strong saline. 

Thus, the cytological observations fully agree with the results of the 
chemical analysis in showing the nucleus to be the source of the nucleo- 
protein. From the latter it was evident that the fibrous material extract- 
able with strong sodium chloride solutions is a desoxyribose nucleotide- 
basic protein complex—a type of compound hitherto found only in cell 
nuclei. From cytological study of spermatozoa, liver cells, etc., it is clear 
that the effect of strong sodium chloride solutions is to cause the chromatin 
of the nucleus to swell and go into solution. Furthermore, from the solu- 
tion obtained by extraction of minced tissue, nucleoprotein is precipitated 
by dilution to approximately isotonic strength; and the same treatment of 
a fresh section of tissue causes precipitation of a fibrous mass within the 
viscous droplet that each nucleus becomes when its nucleoprotein is dis- 
solved in strong saline solution. 

IV. Summary and Conclusion.—A method is described for extraction 
of nucleoproteins from cell nuclei of spermatozoa, and of a variety of other 
tissue cells. The nucleoprotein is a highly polymerized desoxyribose
tetranucleotide in loose combination with basic protein (protamine or histone).
This nucleoprotein of cell nuclei is thus found to have the composition sug- 
gested by the early and classical studies of Miescher and of Kossel of the 
separate and degraded components of the nucleoproteins of spermatozoa, 
prepared by much more drastic and destructive methods.† The nucleoprotein
extracted in the present study is in the form of relatively large
molecules, or particles; and it is therefore probably highly polymerized. 
We suggest that in this form it is much closer to its actual condition in the 
cell nucleus than when in the form in which it has been extracted by other 
methods. The wide applicability of the present method is a unique ad- 
vantage, and one that is most promising; for it makes it appear possible, 
for the first time, to project a study of relatively intact nucleoprotein com- 
plexes on a wide scale. 


In the course of this work we have been fortunate in having the collabora- 
tion of Dr. George Lavin and Dr. Alexandre Rothen of the Rockefeller 
Institute and of Mr. Stanley Walters of the New York State Fish Hatchery 
at Cold Spring Harbor.

* This extraordinarily interesting paper contains a valuable list of references.

† Kossel believed that basic proteins "do not, however, occur in all nuclei, but only
in the nuclei of certain kinds of tissues."⁶

1 Bensley, R. R., Anat. Rec., 72, 351 (1938). 

2 Banga, I., and Szent-Györgyi, A., Enzymologia, 9, 111 (1941).

3 Huiskamp, W., Zeit. Physiol. Chem., 32, 145 (1901).

4 Bang, I., Beitr. Chem. Phys. Path., 4, 115, 331 (1903).

5 Jorpes, E., Acta Med. Scand., 68, 253, 503 (1928).

6 Kossel, A., The Protamines and Histones, Longmans, Green and Company, 1928,
page vii.


A PYRIMIDINE ANALOG OF THIAMINE AND THE GROWTH 
OF FUNGI 


By WILLIAM J. ROBBINS 
NEW YORK BOTANICAL GARDEN AND DEPARTMENT OF BOTANY, COLUMBIA UNIVERSITY
Communicated August 7, 1942 


Through the courtesy of Dr. R. C. Elderfield I received a pyrimidine 
analog of thiamine having the following formula:

[Structural formula of the pyrimidine analog of thiamine; diagram not reproduced.]

This compound was synthesized by Miss Yolanda A. Tota. It is stated
to be at least as stable to heat as thiamine. 

I have tested this compound with three fungi, Phycomyces Blakesleeanus, 
Pythiomorpha gonapodyides and Phytophthora cinnamomi, each of which 
has a different kind of thiamine deficiency. 

Each organism was grown in 25-ml. quantities of a solution containing 
per liter 50 g. dextrose, 1.5 g. KH2PO4, 0.5 g. MgSO4·7H2O, 2.0 g.
asparagine and the following mineral supplements in p.p.m.: 0.005 B, 0.02 Cu,
0.1 Fe, 0.01 Ga, 0.01 Mn, 0.01 Mo and 0.09 Zn. To this solution various
amounts of the analog, of thiamine, of thiazole or of pyrimidine¹ were added
as shown in tables 1 and 2. The solutions were sterilized for 20 minutes at 
12 pounds pressure and each treatment was carried out in triplicate. The 
temperature of incubation was 20°C. Phycomyces was grown for 8 days,
Phytophthora for 11 days and Pythiomorpha for 11 days in the first
experiment, table 1, and 15 days in the second experiment, table 2. At the end of
the period of growth the mycelium was removed from each flask, washed 
with distilled water, dried at 100°C. and weighed. 

The pyrimidine analog was neither beneficial nor detrimental to Phycomyces
even when 1000.0 mμ moles were added per flask (tables 1 and 2).
The addition of pyrimidine to the basal solution containing 2.0 mμ moles
of the analog did not improve growth. These results are to be expected
since all previous evidence shows that both pyrimidine and thiazole must
be available for growth of Phycomyces to occur. Addition of thiazole to
2.0 mμ moles of the analog had little effect as may be noted by comparing


TABLE 1
DRY WEIGHT OF MYCELIUM PRODUCED IN SOLUTION OF MINERALS, ASPARAGINE AND
DEXTROSE SUPPLEMENTED AS INDICATED

ADDITIONS PER FLASK CONTAINING 25 ML.                   AV. DRY WT. PER CULTURE, MG.
OF THE BASAL SOLUTION                               PHYCOMYCES   PYTHIOMORPHA   PHYTOPHTHORA
None                                                0.5          1.2            3.8
0.5 mμ mole analog                                  0.8          1.5            2.2
1.0 mμ mole analog                                  0.1          2.1            0.2
2.0 mμ moles analog                                 0.2          2.5            2.8
10.0 mμ moles analog                                0.7          5.8            2.5
100.0 mμ moles analog                               0.4          20.0           0.4
1000.0 mμ moles analog                              2.9          84.5           0.8
2.0 mμ moles analog + 100.0 mμ moles pyrimidine     1.6          113.9          0.6
2.0 mμ moles analog + 100.0 mμ moles thiazole       4.3          2.9            0.6
0.5 mμ mole thiamine                                77.8         32.9           93.9
1.0 mμ mole thiamine                                138.3        61.5           162.4
2.0 mμ moles thiamine                               160.5        74.7           150.1


the growth obtained with that combination and in the solution containing
2.0 mμ moles of thiamine (table 1). However, when larger quantities of the
analog (10.0, 100.0 or 1000.0 mμ moles) were used in the presence of 100
mμ moles of thiazole considerable growth developed (table 2). The dry
weight with 100.0 mμ moles of the analog in the presence of 100.0 mμ moles
of thiazole was about the same as that found with 0.1 mμ mole of pyrimidine
and 100.0 mμ moles of thiazole; 1000.0 mμ moles of the analog in the
presence of thiazole produced more dry weight than 1.0 mμ mole of
pyrimidine and 100.0 mμ moles of thiazole. The analog was ineffective as a
substitute for thiamine for Phycomyces but it was about 1/1000 as effective
as thiamine as a source of pyrimidine.

Pythiomorpha grows if furnished with the pyrimidine half of the thiamine
molecule; it is able to synthesize the thiazole portion from the sugar,
minerals and asparagine in the basal medium. The analog was not
injurious to Pythiomorpha; in fact 10.0 mμ moles definitely improved its
growth and 100.0 or 1000.0 mμ moles were still more favorable (tables 1
and 2). However, less growth was obtained with 100.0 mμ moles of the
analog than with 0.5 mμ mole of thiamine. In the first experiment
somewhat more growth occurred with 1000.0 mμ moles of the analog than
with 2.0 mμ moles of thiamine (table 1). In the second experiment 1000.0
mμ moles gave about the same yield as 2.0 mμ moles of thiamine and
somewhat more than 0.5 mμ mole of thiamine or of pyrimidine (table 2). As a
source of pyrimidine for Pythiomorpha the analog appeared to be about
1/500 as effective as thiamine.
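
The two effectiveness figures can be read as ratios of matched doses, that is, doses of the analog and of the reference compound that gave about the same dry weight. The sketch below is only an illustration of that arithmetic; the pairing of doses follows the comparisons drawn in the text, and the names `MATCHED_DOSES` and `relative_effectiveness` are hypothetical.

```python
from fractions import Fraction

# Matched doses in millimicromoles per flask that gave about the same dry
# weight. The pairing is an assumption read off the comparisons in the text.
MATCHED_DOSES = {
    # organism: (dose of analog, dose of reference compound, reference)
    "Phycomyces": (Fraction(100), Fraction("0.1"), "pyrimidine, with thiazole"),
    "Pythiomorpha": (Fraction(1000), Fraction("2.0"), "thiamine"),
}


def relative_effectiveness(analog_dose, reference_dose):
    """Effectiveness of the analog relative to the reference, mole for mole."""
    return reference_dose / analog_dose


if __name__ == "__main__":
    for organism, (analog, reference, name) in MATCHED_DOSES.items():
        ratio = relative_effectiveness(analog, reference)
        print(f"{organism}: about {ratio} as effective as {name}")
```

Exact fractions are used so that the ratios print as 1/1000 and 1/500 rather than as rounded decimals.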


TABLE 2
DRY WEIGHT OF MYCELIUM PRODUCED IN SOLUTION OF MINERALS, ASPARAGINE AND
DEXTROSE SUPPLEMENTED AS INDICATED

ADDITIONS PER FLASK CONTAINING 25 ML.               AV. DRY WT. PER CULTURE, MG.
OF THE BASAL SOLUTION                               PHYCOMYCES   PYTHIOMORPHA
None                                                Trace        3.7
10.0 mμ moles analog                                Trace        7.6
100.0 mμ moles analog                               Trace        30.7
1000.0 mμ moles analog                              Trace        100.1
10.0 mμ moles analog + 100 mμ moles thiazole        9.9          6.0
100.0 mμ moles analog + 100 mμ moles thiazole       33.9         39.4
1000.0 mμ moles analog + 100 mμ moles thiazole      136.2        96.0
100.0 mμ moles thiazole                             1.4          0.3
0.5 mμ mole thiamine                                91.0         56.3
1.0 mμ mole thiamine                                127.1        79.7
2.0 mμ moles thiamine                               142.8        102.4
0.1 mμ mole pyrimidine + 100 mμ moles thiazole      33.2         62.9
0.5 mμ mole pyrimidine + 100 mμ moles thiazole      115.5        79.3


Phytophthora will not grow unless supplied with molecular thiamine. 
The analog, even in amounts of 1000.0 mμ moles per flask, was ineffective
with this organism (table 1). 

The response of the three fungi leads to the following conclusions: The 
pyrimidine analog of thiamine does not replace thiamine in the physiology 
of these organisms. Its effectiveness as a source of pyrimidine for
Phycomyces or Pythiomorpha is of the order of 1/500 or 1/1000 that of thiamine. The
action of the analog as a source of pyrimidine might be ascribed to the
presence of traces of pyrimidine as an impurity or more probably to a slight
dissociation of the compound into its constituents or to decomposition in
sterilization. Neither Phycomyces nor Pythiomorpha appears able to split
the compound and obtain pyrimidine from it. In this respect the
pyrimidine analog differs from the pyridine analog studied earlier;² the latter
compound was about as effective a source of pyrimidine as thiamine. 


1 The terms pyrimidine and thiazole are used in this paper to refer to the intermediates
of thiamine, 4-methyl-5-β-hydroxyethyl thiazole and 2-methyl-5-bromomethyl-6-aminopyrimidine
hydrobromide.

2 Robbins, W. J., Proc. Nat. Acad. Sci. 27, 419 (1941). 


ON THE PHYSICAL CHARACTERISTICS OF THE PERSEUS 
CLUSTER OF NEBULAE 


By F. Zwicky 
NORMAN BRIDGE LABORATORY OF PHYSICS, CALIFORNIA INSTITUTE OF TECHNOLOGY
Communicated July 31, 1942 


A. The Radial Distribution of Nebulae in the Perseus Cluster.—In a previous
paper¹ counts of nebulae in the field which covers the Perseus cluster
were communicated. These counts refer only to nebulae which can be
distinguished on limiting exposures taken with the 18-inch Schmidt telescope
on Palomar Mountain. As was done previously² with the data on the clusters
in Coma and in Hydra we here compare the distribution of the nebulae
in the Perseus cluster with the distribution which Emden³,⁴ deduced
theoretically for the bounded isothermal gravitational gas sphere. In table 1
are listed the average numbers $N_r$ per square degree of the nebulae brighter
than about the absolute photographic magnitude $M_L = -14.3$ as a function
of the distance $r$ from the center of the Perseus cluster. From the
numbers previously¹ given, the numbers $N_r$ listed here are obtained by
subtracting 3.0 nebulae per square degree, which represent the average
background of the field nebulae in which the Perseus cluster appears
imbedded. The reduction of the observed distribution to the standard Emden
curve of the projected densities $D$ of the isothermal gas sphere is
accomplished by plotting in figure 1 the values of $5.5N_r$ as a function of the
associated Emden radius $r_1$, which is related to the actual radius vector $r$ by
the equation


$$r = a\,r_1. \tag{1}$$


A good fit between the observational data and the theoretical curve is
obtained if the structural length (or structural index) $a$ is set equal to 10/3
minutes of arc or, in absolute measure, if we take 11 million parsecs as the
distance of the Perseus cluster,

$$a = 3.30 \times 10^{22}\ \text{cm}. \tag{2}$$

In figure 1 the drawn-out curve represents the radial distribution⁴ of the
projected density $D - 37/1000$ of that bounded isothermal Emden sphere
which is obtained from the density $D$ of the infinite sphere by subtracting
the constant 37/1000. Although this constant must be expected to vary
from cluster to cluster it is seen from figure 1 that the adoption of the value


TABLE 1

RADIAL DISTRIBUTION OF NEBULAE IN THE PERSEUS CLUSTER

$r$ IN MINUTES    $N_r$ PER
OF ARC            SQUARE DEGREE    $5.5N_r$    $r_1$    $1000D$    $1000D - 37$
5                 554              3047        1.5      2430       2393
10                214              1177        3        1524       1487
15                145              798         4.5      968        931
20                96               528         6        649        612
25                59               325         7.5      476        439
30                52               286         9        366        329
35                51               281         10.5     305        268
40                34               187         12       254        217
45                22               121         13.5     ...        ...
50                21.4             117         15       195        158
60                16.0             88          18       157        120
70                15.7             86.4        21       ...        ...
80                10.8             59.4        24       ...        ...
90                12.0             66.0        27       ...        ...
100               10.4             57.2        30       92.0       55
110               5.3              29.2        33       ...        ...
120               8.0              44.0        36       ...        ...
130               7.7              42.4        39       ...        ...
140               5.2              28.6        42       ...        ...
150               4.6              25.3        45       63.1       26.1
160               4.1              22.6        48       ...        ...
170               1.22             6.7         51       ...        ...
180               2.30             12.7        54       ...        ...
190               2.33             12.8        57       ...        ...
200               1.27             7.0         60       49         12.0
210               1.40             7.7         63       ...        ...
220               2.93             16.1        66       ...        ...
230               5.25             28.9        69       ...        ...
240               0.95             5.2         72       ...        ...
250               1.73             9.4         75       39         2.0


37/1000 used already in the analysis of the clusters in Coma and in Hydra
results in an equally satisfactory representation of the nebular counts in
the Perseus cluster. Because of the logarithmic scale used in figure 1 for
$N_r$, the fluctuations, which are proportional to $(N_r/r)^{1/2}$, appear
inordinately large for small values of $N_r$. The absolute values of the fluctuations
are, however, within the theoretically expected range. Like the clusters in
Coma and in Hydra, the Perseus cluster may consequently be considered as










VOL. 28, 1942          ASTRONOMY: F. ZWICKY          357


a large scale assembly of nebulae which is statistically stationary. Basing
our further considerations on this result we may again, as in the case of the
clusters in Coma and in Hydra, carry through the quantitative analysis
described elsewhere.² Proceeding in this manner we expect to arrive at a
correct prediction for the magnitude of the velocity dispersion in the Perseus
cluster in dependence of its structural length $a$ and its central density $\rho_0$.

FIGURE 1


B. Some Physical Characteristics of the Perseus Cluster.—In the Emden
isothermal sphere the projected central density $\sigma_0$ and the real space density
$\rho_0$ are connected by the equation²

$$\rho_0 = \sigma_0/(3.03\,a). \tag{3}$$

Choosing the megaparsec as the unit of length, introducing $a$ = 10/3
minutes of arc = 1/18 degree and $\sigma_0 = 3030/5.5$ nebulae per square
degree, and finally converting angular measures into megaparsecs we obtain
for the center of the cluster the density

$$\rho_0 = 460{,}000\ \text{nebulae per cubic megaparsec}. \tag{4}$$
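
The conversions behind equations (2) and (4) can be retraced numerically. The following is a minimal sketch, not the author's own computation: it takes the distance, the structural length and $\sigma_0$ from the text, and assumes a modern value for the parsec in centimetres (`CM_PER_PARSEC`); the function names are hypothetical.

```python
import math

# Inputs quoted in the text.
DISTANCE_MPC = 11.0                 # distance of the Perseus cluster
A_ARCMIN = 10.0 / 3.0               # structural length a, minutes of arc
SIGMA0_PER_SQ_DEG = 3030.0 / 5.5    # projected central density per sq. degree

# Assumption: modern value of the parsec in centimetres.
CM_PER_PARSEC = 3.0857e18


def structural_length_cm():
    """Structural length a of equation (2), in centimetres."""
    a_rad = math.radians(A_ARCMIN / 60.0)
    return a_rad * DISTANCE_MPC * 1.0e6 * CM_PER_PARSEC


def central_number_density():
    """Central density of equation (4), in nebulae per cubic megaparsec."""
    mpc_per_degree = DISTANCE_MPC * math.radians(1.0)
    sigma0 = SIGMA0_PER_SQ_DEG / mpc_per_degree ** 2   # per square megaparsec
    a_mpc = (A_ARCMIN / 60.0) * mpc_per_degree         # a in megaparsecs
    return sigma0 / (3.03 * a_mpc)                     # equation (3)


if __name__ == "__main__":
    print(f"a     = {structural_length_cm():.3e} cm")
    print(f"rho_0 = {central_number_density():,.0f} nebulae per cubic Mpc")
```

With these inputs both quantities come out within about one per cent of the rounded values printed in (2) and (4).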


This number includes only the nebulae listed in table 1, which are all brighter
than about the absolute photographic magnitude $M_L = -14.3$. We denote
with $\mathfrak{M}$ the average mass of these nebulae and we obtain for the real
average central density of matter per cm.³

$$\rho_0 = 4.6 \times 10^{5}\,\gamma\mathfrak{M}/(3 \times 10^{73}) = 1.53 \times 10^{-68}\,\gamma\mathfrak{M}, \tag{5}$$

where $\gamma$ is of the order of unity when it is assumed that the actual central
density of matter is not materially greater than that incorporated in the
brighter nebulae here considered. As was shown before² the velocity
dispersion $(\overline{w^2})^{1/2}$ can be calculated from the relation

$$(\overline{w^2})^{1/2} = a\,[12\pi\Gamma\rho_0]^{1/2}, \tag{6}$$

where $\Gamma$ is the universal gravitational constant. The following table 2
gives $\rho_0$ and $\gamma\mathfrak{M}$ in dependence of various assumed values for the velocity
dispersion $(\overline{w^2})^{1/2}$. The mass of the sun is $\mathfrak{M}_\odot = 2 \times 10^{33}$ g.


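TABLE 2

CENTRAL DENSITY $\rho_0$ AND AVERAGE MASS $\mathfrak{M}$ OF THE NEBULAE BRIGHTER THAN
$M_L = -14.3$ IN THE PERSEUS CLUSTER IN DEPENDENCE OF THE VELOCITY DISPERSION

$(\overline{w^2})^{1/2}$ IN KM./SEC.    $\rho_0$ IN G./CM.³          $\gamma\mathfrak{M}/\mathfrak{M}_\odot$
250                       [illegible]              $7.1 \times 10^{9}$
500                       [illegible]              $2.8 \times 10^{10}$
750                       $2.0 \times 10^{-24}$    $6.4 \times 10^{10}$
1000                      [illegible]              $1.1 \times 10^{11}$
1250                      [illegible]              $1.8 \times 10^{11}$
1500                      [illegible]              $2.6 \times 10^{11}$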

According to information kindly supplied me by Dr. Hubble the data on the
radial velocities $w_r$ of nebulae in the Perseus cluster are still very scant.
An estimate on the basis of these data indicates that $(\overline{w_r^2})^{1/2}$ = 350-400
km./sec. which results in

$$(\overline{w^2})^{1/2} = (3\overline{w_r^2})^{1/2} \approx 600\text{-}700\ \text{km./sec.} \tag{7}$$

Unless faint nebulae or dark matter contribute far more to the mass of the
Perseus cluster than the nebulae which are brighter than $M_L = -14.3$ the
average mass of these nebulae becomes of the order of $5 \times 10^{10}\,\mathfrak{M}_\odot$.
Although we cannot at the present time check this conclusion directly we may
apply the following quantitative test to the interpretation given here of the
observations on the Perseus cluster by comparing these observations with
those made on the Coma cluster. If we denote with the indices P and C
the quantities which refer to the clusters in Perseus and in Coma,
respectively, we have according to (6)

$$(\overline{w^2})_P = \left[a_P^2\,\rho_{0P}/a_C^2\,\rho_{0C}\right](\overline{w^2})_C. \tag{8}$$

Inserting the numerical values for $a_P/a_C$ and $\rho_{0P}/\rho_{0C}$ it follows that

$$(\overline{w^2})^{1/2}_{\text{Persei}} < 0.62\,(\overline{w^2})^{1/2}_{\text{Comae}}, \tag{9}$$

where the "smaller" sign means "somewhat smaller" since the ratio $\rho_{0P}/\rho_{0C}$
was set equal to the ratio per unit cube of the nebulae brighter than the
absolute magnitudes $M_L = -14.3$ and $M_L = -14.5$ for the clusters in Perseus
and in Coma, respectively. Adopting 1100 km./sec. for the dispersion in
radial velocities in the Coma cluster this leads to

$$(\overline{w_r^2})^{1/2}_{\text{Persei}} = \text{somewhat smaller than 680 km./sec.} \tag{10}$$

for the predicted dispersion in the radial velocities of the nebulae in the
Perseus cluster, a result which compares favorably with the dispersion of
400 km./sec. derived by Hubble as a very rough estimate from a few
directly measured velocities.
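
The factor 0.62 in (9) and the figure of 680 km./sec. in (10) can be checked from the structural lengths and central counts of the two clusters. This is a minimal sketch of that arithmetic; as in the text, it assumes that the ratio of central mass densities equals the ratio of the counts of bright nebulae per unit volume, and the function names are hypothetical.

```python
import math

# Structural lengths in cm and central counts per cubic megaparsec (table 3).
A_PERSEUS_CM, A_COMA_CM = 3.30e22, 2.48e22
N_PERSEUS, N_COMA = 460_000, 2_100_000

# Radial velocity dispersion adopted in the text for the Coma cluster.
COMA_RADIAL_DISPERSION_KM_S = 1100.0


def dispersion_ratio():
    """Perseus-to-Coma ratio of velocity dispersions, from equation (8).

    Assumption: the density ratio is replaced by the ratio of the numbers
    of bright nebulae per unit volume, as stated in the text.
    """
    return (A_PERSEUS_CM / A_COMA_CM) * math.sqrt(N_PERSEUS / N_COMA)


def predicted_perseus_dispersion():
    """Upper estimate of the radial velocity dispersion in Perseus, km/sec."""
    return dispersion_ratio() * COMA_RADIAL_DISPERSION_KM_S


if __name__ == "__main__":
    print(f"ratio      = {dispersion_ratio():.3f}")
    print(f"prediction = {predicted_perseus_dispersion():.0f} km/sec")
```

The ratio rounds to the 0.62 of equation (9), and the prediction lies within a few km./sec. of the 680 km./sec. quoted in (10).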


TABLE 3
PHYSICAL CHARACTERISTICS OF THE CLUSTERS OF NEBULAE IN
                                                   COMA                    HYDRA                   PERSEUS
Distance in $10^6$ light years                     45.0                    23.8                    35.9
Diameter in $10^6$ light years                     4.4                     4.7                     6.0
Diameter in minutes of arc                         340                     680                     566
Total number of nebulae brighter than
  photog. mag. $m_L$                               670                     270                     360
Limiting photog. mag. $m_L$                        16.6                    16.2                    16.5
$m_{\max}$ = photog. mag. of brightest nebula      14.1                    13.1                    13.8
$M_{\max}$ = estimated absolute photog. mag.
  of brightest nebula                              −17.0                   −17.0                   −17.0
$M_L = M_{\max} + (m_L - m_{\max})$                −14.5                   −13.9*                  −14.3
$a$ = structural index                             2′ =                    4′ =                    3.33′ =
                                                   $2.48 \times 10^{22}$ cm.   $2.56 \times 10^{22}$ cm.   $3.30 \times 10^{22}$ cm.
Number of nebulae per cubic megaparsec in
  center of cluster with $M < M_L$                 2,100,000               880,000                 460,000


* In a previous paper² $M_L$ for the Hydra cluster by mistake was given as −13.1
instead of −13.7 as would have followed from the considerations there used.


C. Review of Some of the Physical Characteristics of the Clusters of Nebulae 
in Coma, in Hydra and in Perseus.—The clusters in Coma, in Hydra and in 
Perseus are the three largest among the spherically symmetrical clusters 
which are in reach of the 18-inch Schmidt telescope on Palomar Mountain. 
It was shown in a series of investigations² that the observed radial distribution
curves of the brighter nebulae in these clusters can all be reduced to the
same standard curve which represents the radial distribution of the density
of matter in a bounded isothermal gravitational gas sphere. The two re- 
duction factors involved in the reduction of the observational data to the 
theoretical curve can be expressed in terms of the velocity dispersion in the 
cluster, its central density or the average mass and number per unit volume 
of the nebulae involved. The substitution of observed values for the quan- 
tities mentioned into the theoretical relations results in a satisfactory check 
of these relations and furnishes significant support for the following conten- 
tions: (1) clusters of nebulae represent statistically stationary distribu- 
tions of matter as far as the brighter nebulae are concerned; (2) Newton’s 
law of gravitation as first good approximation satisfactorily represents the 
interactions of nebulae separated by distances of the order of one million 
light years; and (3) the masses of the brighter nebulae are of the order of 
10° AA, to 10" MA,. j 

In table 3 are listed some of the data which are significant in the analysis 
of the large-scale physical constitution of the three clusters of nebulae in 
Coma, in Hydra and in Perseus. 

The diameters for the three clusters given in table 3 refer of course only 
to those distances from the center of the clusters at which the emergence 
of the brighter cluster nebulae from the background nebulae can be statis- 
tically ascertained. When sufficient observations with more powerful 
telescopes are available so that the fainter member nebulae of the clusters 
can also be included in the counts it will probably be found that the diame- 
ters of the clusters are still larger than those given in table 3. 

Unfortunately only a few more spherically symmetrical clusters can be 
reached with the 18-inch Schmidt telescope and these are considerably less 
rich in nebulae than those listed in table 3. The important question of the 
segregation of nebulae of different brightness can also only partly be solved 
until larger Schmidt telescopes are available. In addition, the velocity 
distribution as a function of the brightness of the nebulae must be investi- 
gated. The problems of the segregation of nebulae and of the dispersion 
in the velocities demand also further theoretical clarification since the clas- 
sical statistical mechanics gives no answer to the fundamental questions 
which arise in connection with gravitational coöperative assemblies which
are composed of such diverse elements as the nebulae, the stars and the 
constituents of intergalactic and interstellar matter. The uniformity of 
the structure of symmetrical clusters of nebulae observed so far and the 
quantitative agreement of their observed physical characteristics with those 
derived theoretically for the isothermal gravitational "gas" sphere suggest
that the short-time scale associated customarily with the hypothesis of an 
expanding universe will perhaps become the less attractive the more the 
investigations on the large-scale distribution of matter in the universe
progress.


THE EPIDEMIC CURVE


By EDWIN B. WILSON AND MARY H. BURKE
HARVARD SCHOOL OF PUBLIC HEALTH


Communicated July 15, 1942 


In 1929 Soper¹ developed a theory of the epidemic curve based on tracing
the rise and fall of the disease by "generations" of successive groups of
infectious cases. If in the $i$th generation there are $C_i$ infectious cases and $S_i$
susceptibles and if there be a constant number $A$ of susceptibles per
infectious generation coming into the population, his equations to determine
the number of cases and the number of susceptibles in the next generation
were²

$$\frac{C_{i+1}}{C_i} = \frac{S_i}{m} \tag{1a}$$

and

$$S_{i+1} = S_i - C_{i+1} + A, \tag{1b}$$

where $m$ is the number of susceptibles necessary for one old case to
produce just one new case. In the stationary endemic condition $C_{i+1} = C_i$,
$S_i = m$, $S_{i+1} = S_i$ and $C_{i+1} = A$. The length of the generation was
taken to be the "incubation period." Thus if a fortnight be taken as the
incubation period for measles and if there are 150 children per fortnight
coming into the population through birth and growing up (with due
allowance for deaths and emigration and immigration) measles could maintain
itself in a stationary endemic condition with 150 cases per fortnight; the
quantity $m$ would have to be determined by enumerating the susceptibles
in the population under such conditions, and might well be of the order of
4 to 5 years' worth of recruits $A$, which at 150 per fortnight would be 15,600
to 19,500. As measles does not occur in a steady endemic condition but in
sharp epidemics, it is necessary according to Soper's equations that $m$
should be variable or that there should be an accumulation of susceptibles
to a number considerably greater than $m$ before the epidemic, with an
alternative deficit of susceptibles considerably below $m$ after the epidemic.
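
The stationary endemic condition and the quoted range for $m$ can be verified with a few lines of arithmetic. The sketch below is an illustration only; it assumes 26 fortnights to the year, and the names `soper_step` and `recruits_worth` are hypothetical.

```python
FORTNIGHTS_PER_YEAR = 26   # assumption: 52 weeks to the year
A = 150                    # recruits per fortnight, from the text


def soper_step(cases, susceptibles, m, recruits):
    """One generation of equations (1a) and (1b)."""
    new_cases = cases * susceptibles / m
    new_susceptibles = susceptibles - new_cases + recruits
    return new_cases, new_susceptibles


def recruits_worth(years):
    """Number of recruits entering the population in the given years."""
    return years * FORTNIGHTS_PER_YEAR * A


if __name__ == "__main__":
    for years in (4, 5):
        m = recruits_worth(years)
        # With S = m and C = A the state reproduces itself exactly.
        print(years, m, soper_step(A, m, m, A))
```

Starting from $S_i = m$ and $C_i = A$, one step returns the same pair, which is the stationary condition described above.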

Already in February, 1928, Dr. Wade H. Frost when delivering the
Cutter Lectures in Preventive Medicine at the Harvard Medical School
had presented a similar theory based on a somewhat different line of
thought.³ He, too, considered that for the diseases to which his theory
would be applicable one could think in terms of generations of infectious
persons $C_i$ who being in contact with susceptibles $S_i$ for a certain period
would infect some of them who would then become the new generation of
infectious persons at an average time later by the "incubation period";
but he recognized that in the intermixture of susceptibles and infectious
some of the susceptibles might have multiple contacts with infectious
persons and yet could develop at most only one infection apiece. If $S_i$ were
the number of susceptibles, $p = 1/S_i$ would be the chance that any
particular contact would fall upon any particular susceptible and $q = 1 - 1/S_i$
would be the chance that he would escape. If then there were $k_i$ contacts
made between infectious and susceptibles the chance that a susceptible
would escape them all would be $q^{k_i}$ and the chance that he would have at
least one contact would be $1 - q^{k_i}$. The number of infected would
therefore be $S_i(1 - q^{k_i})$. Frost assumed that the number of contacts $k_i$ would
be proportional to the numbers of infectious and of susceptibles jointly, or
$k_i = rC_iS_i$. Thus his equations corresponding to Soper's (1a) and (1b) are

$$C_{i+1} = S_i\left[1 - \left(1 - \frac{1}{S_i}\right)^{rC_iS_i}\right] \tag{2a}$$

and

$$S_{i+1} = S_i - C_{i+1} + A, \tag{2b}$$

except that Frost did not allow for the recruitment of susceptibles at the
rate $A$ per incubation period for the reason that he was satisfied at the
time to give a theory of the curve of an epidemic so sharp that the number
of recruits during the epidemic would not materially influence the course
of the epidemic.

It is clear that any such theory as that proposed by Soper or by Frost 
cannot be expected to explain in quantitative detail the course of any epi- 
demic; any precise theoretical discussion of the epidemic curve must be 
highly mathematical and difficult and hypothetical. The greatest im- 
mediate value of the development of the theory and of attempts at its 
application to concrete instances must be upon the qualitative side in 
indicating the sorts of things which may happen under various idealized 
conditions. It is noteworthy that Soper’s paper did accomplish instructive 
results and that Frost’s discussion in his lectures here, but principally at 
the Johns Hopkins School of Hygiene and Public Health with his students 
in successive years, has likewise been deemed of great value. It is
noteworthy also that although the two theories seem to be different as
exemplified in equations (1) and (2) they are in fact pretty much alike. If the
number of infectious $C_i$ and the contact rate $r$ are small enough so that
only the first two terms of the expansion of $(1 - 1/S_i)^{rC_iS_i}$ need be taken,
(2a) becomes

$$C_{i+1} = S_i[1 - (1 - rC_i)] = rC_iS_i$$

and $r = 1/m$ makes (2a) then the same as (1a) even though Soper
apparently did not think of $1/m$ as a contact rate.⁴

As the formula (2a) is not easy to compute because of the high powers of
numbers very near to 1 to which it leads, it was suggested to Dr. Frost that
the "law of small numbers" could be used to modify the formulae without
material change in the results so long as the number of susceptibles did not
decline too far. Thus $(1 - 1/S_i)^{rC_iS_i} = e^{-rC_i}$ and

$$C_{i+1} = S_i\left[1 - e^{-rC_i}\right] \tag{3a}$$

and

$$S_{i+1} = S_i - C_{i+1} + A. \tag{3b}$$
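
The step from (2a) to (3a) is an approximation rather than an identity, and its size can be seen by taking logarithms. The following worked step is a sketch that assumes only that $S_i$ is large compared with 1:

```latex
\begin{align*}
\left(1 - \frac{1}{S_i}\right)^{rC_iS_i}
  &= \exp\!\left[rC_iS_i \ln\!\left(1 - \frac{1}{S_i}\right)\right] \\
  &= \exp\!\left[-rC_i\left(1 + \frac{1}{2S_i} + \frac{1}{3S_i^{2}} + \cdots\right)\right] \\
  &\approx e^{-rC_i} \qquad (S_i \gg 1).
\end{align*}
```

The neglected terms are of relative order $1/(2S_i)$ in the exponent, which is why the agreement holds only so long as the number of susceptibles does not decline too far.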


Table 1 gives the calculation of an epidemic where a single infectious case
$C_0 = 1$ is introduced into a population of $S_0 = 2000$ susceptibles under the
hypothesis that the rate of effective contact of infectious and susceptibles
is $r = 0.001$ (each infectious person averages 2 contacts with susceptibles),
where the recruitment $A$ is neglected and where the results of formulae (1),
(2) and (3) are compared, keeping calculations to the nearest integer.


TABLE 1

COURSE OF HYPOTHETICAL EPIDEMIC, $C_0 = 1$, $S_0 = 2000$, $r = 0.001$

                          $m = 1000$      $r = 0.001$     $r = 0.001$
GENERATION                FORMULAE (1)    FORMULAE (2)    FORMULAE (3)
1                         2               2               2
2                         4               4               4
3                         8               8               8
4                         16              16              16
5                         32              32              32
6                         62              61              61
7                         116             111             111
8                         204             186             186
9                         317             267             268
10                        393             308             309
11                        332             265             267
12                        171             173             173
13                        59              90              89
14                        17              41              40
15                        5               18              17
16                        1               8               7
17                        ...             3               3
18                        ...             1               1
Total infected            1739            1594            1594
Residual susceptibles     261             406             406

It is clear that the results of calculations by (2) and (3) are essentially
identical, and that the elimination of the double contacts makes the
epidemic longer, more symmetrical and lower at the peak, and leaves more
susceptibles untouched at the end.⁵
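
The three columns of table 1 can be regenerated by iterating formulae (1), (2) and (3) with recruitment neglected. The sketch below is one way of doing so and is not the authors' worksheet; it rounds the new cases to the nearest integer in every generation, which is an assumption about how "to the nearest integer" was applied, and the function names are hypothetical.

```python
import math


def next_cases(formula, cases, susceptibles, r):
    """New cases in the next generation under formulae (1), (2) or (3)."""
    if formula == 1:    # Soper, equation (1a) with m = 1/r
        return r * cases * susceptibles
    if formula == 2:    # Frost, equation (2a)
        escape = (1.0 - 1.0 / susceptibles) ** (r * cases * susceptibles)
        return susceptibles * (1.0 - escape)
    if formula == 3:    # "law of small numbers", equation (3a)
        return susceptibles * (1.0 - math.exp(-r * cases))
    raise ValueError("formula must be 1, 2 or 3")


def run_epidemic(formula, c0=1, s0=2000, r=0.001):
    """Return cases per generation, rounded to the nearest integer.

    Recruitment A is neglected, as in table 1. The run stops at the first
    generation whose rounded number of cases is zero.
    """
    cases, susceptibles, history = c0, s0, []
    while True:
        cases = round(next_cases(formula, cases, susceptibles, r))
        if cases == 0:
            return history
        history.append(cases)
        susceptibles -= cases


if __name__ == "__main__":
    for formula in (1, 2, 3):
        history = run_epidemic(formula)
        total = sum(history)
        print(formula, history, total, 2000 - total)
```

With this rounding rule the column for formulae (1) is reproduced exactly, including the total of 1739. The columns for (2) and (3) come out close to the printed ones but not identical, since the printed figures were evidently not rounded in quite the same way (generation 5, for instance, computes to about 31 rather than 32).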
If the formulae (3) are used a very neat result may be had for the relation
between $S_0/m$, the ratio of initial susceptibles to the number $m = 1/r$, and
$S_E/m$, the ratio of the number of residual susceptibles to the same number.
Indeed

$$S_1 = S_0e^{-rC_0},\quad S_2 = S_1e^{-rC_1},\quad \ldots,\quad S_{n+1} = S_ne^{-rC_n}. \tag{4}$$

Multiplying these together and cancelling $S_1, \ldots, S_n$,

$$S_{n+1} = S_0e^{-r(C_0 + C_1 + \cdots + C_n)}. \tag{5}$$

In any epidemic to which the theory applies the initial number of cases $C_0$
introduced into the population of susceptibles would be few compared with
the number around the peak, and the terminal is 0 after the epidemic has
passed. Hence if $S_E$ be the number of susceptibles left, the total cases
$C_0 + C_1 + \cdots + C_n$ must be essentially $S_0 - S_E$, and one has the result

$$S_E = S_0e^{-r(S_0 - S_E)} \quad\text{or}\quad \frac{S_E}{S_0} = e^{-rS_0(1 - S_E/S_0)}, \tag{6}$$

where $rS_0 = S_0/m$ and $rS_E = S_E/m$. If $F = S_0/m$ and $f = S_E/m$,

$$f/F = e^{-F(1 - f/F)}. \tag{7}$$

This relation (7) between $f$ and $F$ cannot be solved explicitly for either, but a
table of corresponding values of $F$ and $f$ may be computed and tabulated as in
table 2. The results of the table are given in the figure.


TABLE 2 
RELATION BETWEEN THE RATIOS F AND f 

ft _ Se re pa SE I _ SB _ 2 _ SE 

F So m m F So m m 
0.01 4.652 0.04652 0.32 1.676 0.5362 
0.02 3.992 0.07984 0.35 1.615 0. 5653 
0.03 3.615 0.1084 0.40 1.527 0.6109 
0.04 3.353 0.1341 0.45 1.452 0.6533 
0.05 3.153 0.1577 0.50 1.386 0.6932 
0.07 2.859 0.2002 0.55 1.329 0.7307 
0.10 2.558 0.2558 0.60 1.277 0.7663 
0.12 2.409 0.2891 0.65 1.231 0.8000 
0.15 2.232 0.3348 0.70 1.189 0.8322 
0.17 2.135 0.3629 0.75 1.151 0.8630 
0.20 2.012 0.4024 0.80 1.116 0.8926 
0.22 1.941 0.4271 ‘0.85 1.083 0.9210 
0.25 1.848 0.4621 0.90 1.054 0.9482 
0.27 1.794 0.4843 0.95 1.026 0.9746 
0.30 1.720 0.5160 1.00 1.000 1.0000 
Figure: Plot of the relation between the fraction $S_E/S_0$ of residual
susceptibles to initial susceptibles and the multiple $S_0/m$ which initial
susceptibles are of $m = 1/r$ (scale on the left), with enlargement for a part
of the range (scale on the right).

It is clear that if the number of susceptibles at the start were 3.35 m,
only 4% of the original susceptibles would remain untouched by the epidemic;
96% would contract the disease. In Panum's Faroe Islands epidemic of measles
something like 96% of the population of the villages entered by the disease
was actually attacked by it.⁷ It is further clear from the table that if
$S_0/m$ were only about 2, the fraction of the susceptibles which would escape
would be 20%. For Hedrich's analysis of measles epidemics in Baltimore,⁸ the
table would not be strictly applicable because there was recruitment of the
population which for a fairly long-drawn-out epidemic of some 8 months might
not be negligible. For his epidemic of 1930-1931, the susceptibles at the
beginning were 78,968 but they rose to 81,449 during the initial stages where
recruitment exceeded cases; from then they fell to 52,111 before the end of
the epidemic, rising to 54,408 at its end. The ratio of minimum to maximum is
0.64 whereas that of end to beginning is 0.69. If we enter the table with the
value $S_E/S_0 = 0.64$ we find $S_0/m = 1.240$ which would give m = 65,300 on
the base $S_0$ = 81,000; if we enter with $S_E/S_0 = 0.69$ we find
$S_0/m = 1.199$ which would give m = 66,000 on the basis of $S_0$ = 79,000.
These estimates of m are nearly alike. There were about 13,000 children coming
into the susceptible group each year, which means that the number of
susceptibles m = 1/r would correspond to about 5 years of births, which may
not be an unreasonable result in view of the certainty that both the theory
and the calculations can at best be regarded as only approximately
representing real conditions.
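
The two estimates of m quoted for the Baltimore epidemic can be retraced with a few lines of arithmetic. This is a sketch only: it uses the unrounded ratios rather than 0.64 and 0.69, so the results agree with the quoted 65,300 and 66,000 to within rounding, and the function names are illustrative.

```python
import math

def multiple_from_ratio(ratio):
    """F = S_0/m for a residual ratio S_E/S_0, from relation (7)."""
    return -math.log(ratio) / (1.0 - ratio)

def estimate_m(s_start, s_end, s_base):
    """Estimate m = s_base / F, where F corresponds to the ratio s_end/s_start."""
    return s_base / multiple_from_ratio(s_end / s_start)

# Minimum-to-maximum ratio, on the base S_0 = 81,000.
m_min_max = estimate_m(81449, 52111, 81000)
# End-to-beginning ratio, on the base S_0 = 79,000.
m_end_begin = estimate_m(78968, 54408, 79000)

# About 13,000 children entered the susceptible group each year.
years_of_births = m_min_max / 13000

if __name__ == "__main__":
    print(round(m_min_max), round(m_end_begin), round(years_of_births, 1))
```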


1 Soper, H. E., "The Interpretation of Periodicity in Disease Prevalence," J. Roy.
Statist. Soc. London, 92, 34-61 (1929).

2 We shall not restate in detail the conditions under which the theory might be
considered as approximately true, nor at this time enter upon a discussion of the
question of periodicity. The rigorous equations for a theory involving the basic
conceptions and restrictions would seem to be these: First, there are at any time a
number of infectious persons $I(t)$ and a number of susceptibles $S(t)$. Second, the
rate of loss of susceptibles is $-dS/dt$ and should be set equal to the rate at which
susceptibles become infected, namely, $C(t)$, less the rate of recruitment of new
susceptibles $A(t)$. Third, the rate $C(t)$ is taken to be proportional to the product
of $I(t)$ and $S(t)$. Fourth, the newly infected persons $C(t)\,dt$ become infectious
after a latent time $\tau$ and remain infectious for a time $\sigma$. Hence

$$-\frac{dS}{dt} = C(t) - A(t), \qquad C(t) = rI(t)S(t), \qquad I(t) = \int_{t-\tau-\sigma}^{t-\tau} C(t)\,dt. \tag{A}$$

The factor $r$ and the rate of recruitment $A$ are generally taken as constant, though
Soper shows that probably $r$ or his $m$ which is the reciprocal of $r$ has a seasonal
variation. Here the symbols $C$ and $A$ are rates, instead of being numbers of
individuals as in (1). The variables $C$ and $I$ may be eliminated to get

$$A - \frac{dS}{dt} = rS(t)\int_{t-\tau-\sigma}^{t-\tau}\left(A - \frac{dS}{dt}\right)dt = rS(t)\left[A\sigma - S(t-\tau) + S(t-\tau-\sigma)\right] \tag{B}$$

which is a differential-difference equation for $S$.

3 Dr. Frost’s lectures were on Feb. 2-3, 1928, and the dates of my letters to him were 
Feb. 9, 23 with replies from him dated Feb. 14, Mar. 21. It was in my second letter that 
I suggested the use of the law of small numbers in the way mentioned below. I strongly 
urged Dr. Frost to publish his theory of the epidemic curve, but he thought it too slight a 
contribution.—E. B. W. 

4 If we take Frost's equations and express the condition for a steady state we have
$C_i = C_{i+1} = A$ and $A = S\{1 - (1 - r)^A\}$. This equation cannot be solved
strictly for the relationship $rS = 1$ with $S = m$ for the steady state; the relation
between $r$, $A$, $S$ for the steady state is more complicated. If we make $S$ constant
in (B) we have $1 = rS\sigma$ so that it is $r\sigma$ which takes the place of $r$ in
Frost's theory or of $1/m$ in Soper's; this is quite to be expected because the
effective rate of generation of new cases must be the product of a contact rate $r$ by
a time $\sigma$ available for making contacts. While for illustrative purposes to show
what sorts of things may happen we may try different values of $r$ or $1/m$ in the
Frost or Soper theories, and different values of $C_0$ and $S_0$, and of $A$ if we wish
to admit recruitment, it must be remembered that in efforts to interpret concrete
epidemics by the theory, the quantities $r$, $C_0$, $S_0$, $A$ have to be determined
from the data and cannot be expected to be determinable except within rather wide
limits. As a matter of fact the contact rate $r$ must be highly variable within any
community because contact within the home, within the school, and within the
community at large must be at very different rates, so that at best $r$ or $m$ must be
a sort of over-all community average of such very different rates.

5 One of the limitations of Soper's set-up is that if $m$ is sufficiently small, the
calculation runs into the impossible situation where $C_{i+1}$ becomes greater than the
remaining $S_i$; for example, with $C_0 = 1$, $S_0 = 2000$, and $m = 500$, the cases in
successive generations are 4, 16, 63, 243, 814 (by which time $S = 860$), and the next
value of $C$ by (1) comes out at around 1400. Frost's method involving elimination of
double contacts seems not to suffer from this defect; calculating by the law of small
numbers as in (3) we find with $r = 1/500$ for this case the successive values of $C$ as
4, 16, 62, 224, 611, 764, 250, 27, 2, leaving 40 susceptibles still untouched. The
calculations have been carried to tenths and then rounded off at the end in tabulating
cases in successive generations. If we go through the detailed calculation with the
Frost formulae (2) we find the same integral values of $C$ as with formulae (3). When
calculating according to Soper's formula (1a) we may be doing him an injustice; after
giving that formula he shifted over to the formula

$$\frac{C_{i+1/2}}{C_{i-1/2}} = \frac{S_i}{m}, \quad \text{i.e.,} \quad \frac{\text{no. cases next interval}}{\text{no. cases last interval}} = \frac{\text{no. susceptibles at present}}{m},$$

"since the change in $S_i$ is usually small in the unit interval." This shift is
advantageous for the analytical developments upon which he is entering and surely makes
no change which would not be within the tolerances of approximations in the theory as
applied in concrete cases. It is impossible to determine from his paper whether he
based his numerical calculations upon the original form (1a) or upon this modified form.
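
The generation-by-generation calculation by the law of small numbers can be sketched as follows. The recursion $C_{i+1} = S_i(1 - e^{-rC_i})$, $S_{i+1} = S_i - C_{i+1}$ is assumed here to be the form referred to as (3), since that formula lies outside this excerpt; the function name and the stopping threshold are illustrative. Because the sketch carries full floating-point precision instead of tenths, a generation may differ by a unit from the figures quoted above.

```python
import math

def small_numbers_epidemic(c0, s0, r, stop_below=0.5):
    """Iterate C_{i+1} = S_i * (1 - exp(-r * C_i)), S_{i+1} = S_i - C_{i+1}.

    Returns the new cases in successive generations (excluding c0) and the
    susceptibles remaining once new cases fall below `stop_below`.
    """
    cases = []
    current_cases, susceptibles = float(c0), float(s0)
    while True:
        new_cases = susceptibles * (1.0 - math.exp(-r * current_cases))
        susceptibles -= new_cases
        if new_cases < stop_below:
            return cases, susceptibles
        cases.append(new_cases)
        current_cases = new_cases

cases, residual = small_numbers_epidemic(c0=1, s0=2000, r=1 / 500)

if __name__ == "__main__":
    print([round(c) for c in cases], round(residual))
```

The epidemic dies out of its own accord with a few dozen susceptibles untouched, which is the behaviour contrasted above with the runaway value obtained from Soper's formula.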

6 Under the conditions, the epidemic has to die out, as it would not have to if there were
recruitment; if one would add the initial cases $C_0$ which were introduced into the
susceptible population $S_0$ to $S_0$ itself and use $S_0 + C_0$ in place of $S_0$ in (6)
it would not be necessary to disregard the small number $C_0$ as mentioned in the text.

7 Panum, P. L., Observations Made During the Epidemic of Measles on the Faroe Islands 
in the Year 1846, Delta Omega Society, 1940, distributed by the American Public Health 
Association, New York, N. Y. 

8 Hedrich, A. W., "Monthly Estimate of the Child Population Susceptible to Measles,
1900-1931, Baltimore, Md.," Amer. J. Hygiene, 17, 613-636 (1933).

AUTOMORPHISMS OF THE DIHEDRAL GROUPS
By G. A. MILLER 
DEPARTMENT OF MATHEMATICS, UNIVERSITY OF ILLINOIS 


Communicated July 24, 1942 


The group of inner automorphisms of a dihedral group whose order is 
twice an odd number is obviously the group itself while the group of inner 
automorphisms of a dihedral group whose order is divisible by 4 is the 
quotient group of this dihedral group with respect to its invariant subgroup 
of order 2 when the dihedral group is not the four group. It is well known 
that the four group is the only abelian dihedral group and that its group of 
automorphisms is the symmetric group of order 6. All of these automor- 
phisms except the identity are outer automorphisms but this symmetric 
group admits no outer automorphisms. In fact, we shall prove that it is 
the only dihedral group which does not admit any outer automorphisms. 
To emphasize this fact it may be noted here that on page 152 of the Survey 
of Modern Algebra by Birkhoff and MacLane (1941) it is stated that the 
group of symmetries of the square, which is also a dihedral group, admits 
no outer automorphisms. This is obviously not in agreement with the 
theorem noted above. 
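
The opening statement about inner automorphisms can be checked by direct enumeration for small orders. The sketch below represents the element $r^k s^f$ of the dihedral group of order $2n$ as a pair `(k, f)`; this encoding and the function names are illustrative assumptions, not notation from the paper.

```python
def dihedral_elements(n):
    """Elements r^k s^f of the dihedral group of order 2n, stored as (k, f)."""
    return [(k, f) for f in (0, 1) for k in range(n)]

def multiply(a, b, n):
    """(r^k s^f)(r^j s^g) = r^(k + (-1)^f j) s^(f + g), using s r = r^-1 s."""
    (k, f), (j, g) = a, b
    return ((k + (j if f == 0 else -j)) % n, (f + g) % 2)

def inverse(a, n):
    """Rotations invert to r^-k; every element r^k s is its own inverse."""
    k, f = a
    return ((-k) % n, 0) if f == 0 else (k, 1)

def inner_automorphism_count(n):
    """Number of distinct maps x -> g x g^-1 as g runs over the group."""
    elems = dihedral_elements(n)
    maps = set()
    for g in elems:
        g_inv = inverse(g, n)
        maps.add(tuple(multiply(multiply(g, x, n), g_inv, n) for x in elems))
    return len(maps)

if __name__ == "__main__":
    for n in range(2, 9):
        print(2 * n, inner_automorphism_count(n))
```

For odd n the count equals the order 2n of the group, and for even n greater than 2 it equals n, the order of the quotient by the invariant subgroup of order 2.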

To prove that the symmetric group of order 6, which is also the group of 
movements of the equilateral triangle, is the only dihedral group which 
does not admit any outer automorphisms it may first be noted that if a 
dihedral group whose order is twice an odd number admits no outer auto- 
morphisms its cyclic subgroup of index 2 cannot have an order which 
exceeds 3 since in the group of automorphisms of a cyclic group every 
operator of highest order corresponds to every other such operator and 
every operator may correspond to its inverse. If a dihedral group whose 
order is divisible by 4 admits no outer automorphisms it cannot be the 
four group and hence it contains a cyclic subgroup which involves an in- 
variant subgroup of order 2. Its non-invariant operators correspond to 
themselves multiplied by every operator of this subgroup in some auto- 
morphism of the group. That is, the dihedral group of order 6 is the only 
dihedral group which admits no outer automorphisms. 

Since the dihedral group of order 6 involves no invariant operator be- 
sides the identity and admits no outer automorphisms it is its own group of 
automorphisms. To prove that no other dihedral group whose order is 
twice an odd number is its own group of automorphisms it may be noted 
that if such a group has this property its cyclic subgroup of index 2 cannot 
admit any automorphisms besides the one in which each of its operators 
corresponds to its inverse and hence this cyclic subgroup of odd order must 
be of order 3 and hence the group must be the dihedral group of order 6.
If a dihedral group whose order is divisible by 4 is its own group of
automorphisms its cyclic subgroup of index 2 may be assumed to have an order
which exceeds 2 and to admit also no automorphism besides the identity 
and the one in which each of its operators corresponds to its inverse. 
Hence this cyclic group is of order 4 and it results that the octic group and 
the symmetric group of order 6 are the only two dihedral groups which are their 
own groups of automorphisms. 

If a dihedral group whose order is twice an odd number admits the same 
number of outer automorphisms as inner automorphisms its cyclic sub- 
group of index 2 must have the cyclic group of order 4 for its group of auto- 
morphisms and hence it is the metacyclic group of order 20. This results 
from the fact that the group of automorphisms of the cyclic group of prime 
order p is the cyclic group of order p—1. If a dihedral group whose order is 
divisible by 4 admits the same number of outer automorphisms as inner 
automorphisms and is not the four group its cyclic subgroup of index 2 
cannot have a larger order than 4 since its non-invariant operators of 
order 2 are transformed into themselves multiplied by the operators of the 
invariant subgroup of order 2 under its group of inner automorphisms, and 
hence they are transformed into themselves multiplied by the remaining 
operators of this cyclic subgroup by the outer automorphisms of the group. 
Hence it results that there are two and only two dihedral groups which have 
the property that they admit just as many outer automorphisms as inner auto- 
morphisms. One of these is the octic group while the other is the meta- 
cyclic group of order 20. 

It was noted at the opening of this article that the group of inner auto- 
morphisms of a non-abelian dihedral group whose order is divisible by 4 is 
its quotient group with respect to its invariant subgroup of order 2. Hence 
every such dihedral group involves exactly two subgroups which are sepa- 
rately simply isomorphic with their groups of inner automorphisms while 
every other dihedral group except the four group is simply isomorphic with 
its group of inner automorphisms. On the other hand, the group of
automorphisms of every non-abelian dihedral group is known¹ to be the
holomorph of its cyclic subgroup of index 2. There is therefore no upper limit
to the ratio of the orders of the group of automorphisms of the general 
dihedral group and the order of this dihedral group, and the study of the 
outer automorphisms of a dihedral group is practically reduced to the study 
of the holomorphs of cyclic groups. 
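
Since the holomorph of a cyclic group of order $n$ has order $n\varphi(n)$, the statement about the holomorph can be tested by brute force for small $n$. The sketch below counts the automorphisms by trying every pair of images for the two generators; the pair encoding `(k, f)` for $r^k s^f$ and the function names are illustrative assumptions.

```python
from math import gcd

def multiply(a, b, n):
    """(r^k s^f)(r^j s^g) = r^(k + (-1)^f j) s^(f + g) in the dihedral group."""
    (k, f), (j, g) = a, b
    return ((k + (j if f == 0 else -j)) % n, (f + g) % 2)

def automorphism_count(n):
    """Count bijective homomorphisms of the dihedral group of order 2n."""
    elems = [(k, f) for f in (0, 1) for k in range(n)]
    count = 0
    for x in elems:            # candidate image of the rotation r
        for y in elems:        # candidate image of the reflection s
            phi = {}
            for (k, f) in elems:
                image = (0, 0)
                for _ in range(k):
                    image = multiply(image, x, n)
                if f:
                    image = multiply(image, y, n)
                phi[(k, f)] = image   # r^k s^f  ->  x^k y^f
            if len(set(phi.values())) != len(elems):
                continue              # not one-to-one
            if all(phi[multiply(a, b, n)] == multiply(phi[a], phi[b], n)
                   for a in elems for b in elems):
                count += 1
    return count

def euler_phi(n):
    """Number of integers in 1..n prime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

if __name__ == "__main__":
    for n in range(2, 8):
        print(2 * n, automorphism_count(n), n * euler_phi(n))
```

For the non-abelian cases the ratio of the two orders is $\varphi(n)/2$, which has no upper limit; for the four group the count is 6, the order of the symmetric group noted at the opening.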

Every abelian group H can be extended by operators which transform 
every operator of H into its inverse so as to obtain a group G of twice the 
order of H known as the generalized dihedral group of H. A necessary and 
sufficient condition that G is abelian is that all of its operators besides the 
identity are of order 2. As the group of inner automorphisms of G is its
quotient group with respect to its subgroup generated by its invariant
operators it results that a necessary and sufficient condition that it is simply
isomorphic with G is that H is of odd order. It therefore results that a
necessary and sufficient condition that the group of inner automorphisms 
of the generalized dihedral group is simply isomorphic with itself is the 
same for both the dihedral group and the generalized dihedral group. 

To prove the fact that it is not possible for a generalized dihedral group 
which is not also a dihedral group to be its own group of automorphisms it 
may first be noted that the order of such a generalized dihedral group 
could clearly not be twice an odd number since in that case it would be 
simply isomorphic with its group of inner automorphisms but would 
clearly also admit outer automorphisms since in every such inner auto- 
morphism an operator of odd order corresponds either to itself or to its 
inverse. Since every non-cyclic abelian group involves at least one non- 
cyclic Sylow subgroup and every non-cyclic Sylow group involves a non- 
identity automorphism in which not every operator corresponds to its 
inverse, it results that there is no generalized dihedral group which is not also 
a dihedral group but which is its own group of automorphisms. It was noted 
above that such a group may be its own group of inner automorphisms. 

To prove that the group of automorphisms of every dihedral group whose 
order is twice an odd number as well as the group of automorphisms of 
every non-abelian generalized dihedral group whose order is twice an odd 
number is a complete group, it may first be noted that each of these groups 
of automorphisms is known to be the holomorph of its invariant abelian 
subgroup. Moreover, this holomorph involves no invariant operator be- 
sides the identity. If this group of automorphisms is transformed into 
itself by an operator which is not contained in it the co-set with respect to 
the given group of automorphisms which contains this transforming 
operator contains also an operator which is commutative with every opera- 
tor of the given dihedral group or of the given generalized dihedral group, 
respectively. 

The latter operator can therefore be so selected that its first power which 
appears in the given group of automorphisms is the identity and that it is 
commutative with every operator of this group since no two of the operators 
of this group transform the operators of the given invariant dihedral group 
in the same manner. It may be emphasized here that while a group may 
be its own group of automorphisms there is no group whose order exceeds 2
which is its own holomorph. This results from the fact that the holomorph 
of a group necessarily includes the group and if a group is non-abelian its 
holomorph contains a subgroup which is simply isomorphic with it and is 
composed of operators which are separately commutative with every opera- 
tor of the given group. The operation of forming successive holomorphs 
beginning with a group whose order exceeds 2 therefore leads to an infinite 
system of groups of increasing orders.

Since the group of automorphisms of a dihedral group whose order is
divisible by 4 contains an invariant operator of order 2 it cannot be a
complete group. Moreover, it does not follow in this case that if an operator
is commutative with every operator of a dihedral group it is also
commutative with every operator of its group of automorphisms, since two 
distinct operators of this group do not necessarily transform the operators 
of the given invariant dihedral group in a different manner when the order 
of this dihedral group is divisible by 4. The automorphisms of the dihedral 
groups are unusually well adapted for the study of the general properties 
of the group of automorphisms of a given group in view of the properties 
noted above. 


1 Miller, Blichfeldt, Dickson, Finite Groups, p. 169 (1916).


1. Let

$$(1 - x + tx)^m = \sum_{n=0}^{\infty} a_n t^n, \qquad (1 - t)^{-r-1} = \sum_{n=0}^{\infty} A_n^{(r)} t^n, \qquad |t| < 1,$$

$$A_n^{(r)} S_n^{(r)} = A_n^{(r)} a_0 + A_{n-1}^{(r)} a_1 + \ldots + A_0^{(r)} a_n,$$

then the equation

$$(1 - x + tx)^m (1 - t)^{-r-1} = \sum_{n=0}^{\infty} A_n^{(r)} F(-m, -r; -n-r; x)\, t^n \tag{1}$$

mentioned in a previous paper,¹ indicates that

$$S_n^{(r)} = F(-m, -r; -n-r; x).$$

Hence

$$\lim_{n \to \infty} S_n^{(r)} = 1.$$

This means that the Cesàro sum $(C, r)$ of the series $\sum a_n$ is 1. Equation
(1) still holds for $x = 1$ if $m$ is a positive integer and we write

$$F(-m, -r; -n-r; 1) = n!\,\Gamma(n + r - m + 1)/[(n - m)!\,\Gamma(n + r + 1)].$$

The corresponding equation

$$A_n^{(r)} F(a, -r; -n-r; 1) = A_{n+a}^{(r)}, \quad n + a \geq 0; \qquad = 0, \quad n + a < 0$$

holds even if $r + 1$ is a positive integer. The zero value when $n + a < 0$
is indicated by Euler's relation

$$F(a, -r; -n-r; x) = (1 - x)^{-n-a} F(-n-r-a, -n; -n-r; x).$$
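
The identification of the Cesàro mean with a terminating hypergeometric series can be checked in exact rational arithmetic. The sketch below assumes the definitions as read above: $A_n^{(r)}$ is the coefficient of $t^n$ in $(1-t)^{-r-1}$, and $S_n^{(r)}$ is the weighted mean of the coefficients $a_k$ of $(1 - x + tx)^m$. A non-integral $r$ is used so that no denominator vanishes; the function names are illustrative.

```python
from fractions import Fraction
from math import comb

def rising(a, k):
    """Pochhammer symbol (a, k) = a (a + 1) ... (a + k - 1)."""
    result = Fraction(1)
    for j in range(k):
        result *= a + j
    return result

def cesaro_coefficient(r, n):
    """A_n^(r), the coefficient of t^n in (1 - t)^(-r-1), i.e. (r+1, n)/n!."""
    return rising(r + 1, n) / rising(Fraction(1), n)

def cesaro_mean(m, r, x, n):
    """S_n^(r) for the series with terms a_k = C(m, k) x^k (1 - x)^(m - k)."""
    a = [comb(m, k) * x**k * (1 - x)**(m - k) for k in range(m + 1)]
    total = sum(cesaro_coefficient(r, n - k) * a[k]
                for k in range(min(m, n) + 1))
    return total / cesaro_coefficient(r, n)

def hypergeometric_poly(m, r, x, n):
    """F(-m, -r; -n-r; x); the series terminates because -m is an integer."""
    return sum(rising(Fraction(-m), k) * rising(-r, k)
               / (rising(-n - r, k) * rising(Fraction(1), k)) * x**k
               for k in range(m + 1))

if __name__ == "__main__":
    r, x = Fraction(1, 2), Fraction(1, 3)
    for n in (0, 1, 5, 50):
        print(n, cesaro_mean(3, r, x, n), hypergeometric_poly(3, r, x, n))
```

Both columns agree exactly, and the common value tends to 1 as $n$ increases.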
2. The same relation in the form

$$F(n+1, n+1; \tfrac12 x + n + \tfrac32; \tfrac12) = (1 - \tfrac12)^{\frac12 x - n - \frac12}\, F(\tfrac12 + \tfrac12 x, \tfrac12 + \tfrac12 x; n + \tfrac12 x + \tfrac32; \tfrac12)$$

may be used to obtain an asymptotic expression for the integral

$$t_n(x) = \int_0^{\infty} e^{-x\alpha} \operatorname{sech} \alpha\, \tanh^n \alpha\, d\alpha$$

considered in a previous paper.² In fact if the foregoing relation is applied
to the expression

$$t_n(x) = \frac{n!}{(x + 1)(x + 3) \ldots (x + 2n + 1)}\, F(n+1, n+1; \tfrac12 x + n + \tfrac32; \tfrac12),$$

it is found that when $n$ is a large positive integer

$$t_n(x) \sim 2^{-\frac12 - \frac12 x} B(n + 1, \tfrac12 + \tfrac12 x) \sim (2n)^{-\frac12 - \frac12 x}\, \Gamma(\tfrac12 + \tfrac12 x)$$

where $B(x, y)$ is the Beta function. The usual form of Laplace's method is
inapplicable to $t_n(x)$ because the function $\tanh \alpha$ has its maximum value
at one end of the range of integration. The hypergeometric function under
consideration is also not included in the types F(a, b = n; c; x), F(a = n,
b = n; c; x), F(a, b + n; c + n; x), which O. Perron³ has studied for large
positive values of $n$. These functions are represented by him with the aid
of generating functions.
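
The asymptotic expression can be compared numerically with the integral. The sketch below evaluates $t_n(x)$ in two ways, by composite Simpson quadrature of the defining integral and by the transformed hypergeometric form with the Beta factor, and then forms the ratio to $(2n)^{-\frac12 - \frac12 x}\Gamma(\tfrac12 + \tfrac12 x)$. The truncation of the range at 40, the step count, the number of series terms and the function names are all assumptions chosen for illustration.

```python
import math

def t_quadrature(n, x, upper=40.0, steps=4000):
    """Composite Simpson estimate of the integral defining t_n(x), x > 0."""
    def integrand(a):
        return math.exp(-x * a) / math.cosh(a) * math.tanh(a) ** n
    h = upper / steps
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0

def hyp2f1(a, b, c, z, terms=200):
    """Partial sum of the Gauss series; ample for z = 1/2 and these parameters."""
    term, total = 1.0, 1.0
    for k in range(terms):
        term *= (a + k) * (b + k) / ((c + k) * (k + 1)) * z
        total += term
    return total

def t_closed(n, x):
    """2^(-p) B(n + 1, p) F(p, p; n + p + 1; 1/2) with p = 1/2 + x/2."""
    p = 0.5 + 0.5 * x
    log_beta = math.lgamma(n + 1) + math.lgamma(p) - math.lgamma(n + 1 + p)
    return 2.0 ** (-p) * math.exp(log_beta) * hyp2f1(p, p, n + p + 1, 0.5)

def t_asymptotic(n, x):
    """(2n)^(-p) Gamma(p) with p = 1/2 + x/2."""
    p = 0.5 + 0.5 * x
    return (2.0 * n) ** (-p) * math.gamma(p)

if __name__ == "__main__":
    for n in (5, 50, 500):
        print(n, t_closed(n, 1.0) / t_asymptotic(n, 1.0))
```

The printed ratios approach 1 slowly, the relative difference being of the order of $1/n$ in this example.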
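The more general integral of Stieltjes⁴

$$t_{m,n}(x) = \int_0^{\infty} e^{-x\alpha} \operatorname{sech}^m \alpha\, \tanh^n \alpha\, d\alpha, \qquad m \geq 0,\ n \geq 0,$$

satisfies the recurrence relation

$$(m + n)\, t_{m,n+1}(x) - n\, t_{m,n-1}(x) + x\, t_{m,n}(x) = 0$$

and can be obtained as a coefficient in the expansion of the integral

$$I = \int_0^{\infty} e^{-x\alpha} (\operatorname{ch} \alpha - z \operatorname{sh} \alpha)^{-m}\, d\alpha = \sum_{n=0}^{\infty} (m, n)\, t_{m,n}(x)\, z^n/n!$$

$$= \left(\frac{2}{1 - z}\right)^m \frac{1}{x + m}\, F\!\left(\tfrac12 x + \tfrac12 m, m; \tfrac12 x + \tfrac12 m + 1; \frac{z + 1}{z - 1}\right) = \frac{1}{x + m}\, F(1, m; \tfrac12 x + \tfrac12 m + 1; \tfrac12 + \tfrac12 z).$$

Hence

$$t_{m,n}(x) = \frac{n!}{(x + m)(x + m + 2) \ldots (x + m + 2n)}\, F(n + 1, m + n; \tfrac12 x + \tfrac12 m + n + 1; \tfrac12)$$

$$= 2^{\frac12 m - \frac12 x - 1} B(\tfrac12 x + \tfrac12 m, n + 1)\, F(\tfrac12 x + \tfrac12 m, \tfrac12 x + 1 - \tfrac12 m; \tfrac12 x + \tfrac12 m + n + 1; \tfrac12).$$

When $n$ is large

$$t_{m,n}(x) \sim 2^{m-1} (2n)^{-\frac12 x - \frac12 m}\, \Gamma(\tfrac12 x + \tfrac12 m).$$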


3. There is an extension of equation (1) involving Appell's hypergeometric
function of two variables of the first type⁵

$$(1 - t)^{a+b-c}\,[1 - (1 - x)t]^{-a}\,[1 - (1 - y)t]^{-b} = \sum_{n=0}^{\infty} A_n^{(c-1)} F_1(-n; a, b; c; x, y)\, t^n,$$

$$|t| < 1, \qquad |xt| < |1 - t|, \qquad |yt| < |1 - t|,$$

where

$$F_1(d; a, b; c; x, y) = \sum_{p=0}^{\infty} \sum_{q=0}^{\infty} \{(d, p + q)(a, p)(b, q)\, x^p y^q / [(c, p + q)(p!)(q!)]\}.$$

The equation indicates that if

$$[1 - (1 - x)t]^{-a}\,[1 - (1 - y)t]^{-b} = \sum_{n=0}^{\infty} e_n t^n,$$

$$[1 + xt/(1 - t)]^{-a}\,[1 + yt/(1 - t)]^{-b} = \sum_{n=0}^{\infty} c_n t^n,$$

then, for the series $\sum c_n$, the Cesàro sum $(C, c - 1)$ is found from

$$S_n^{(c-1)} = F_1(-n; a, b; c; x, y)$$

and for the series $\sum e_n$, the Cesàro sum $(C, c - a - b - 1)$ is found from

$$A_n^{(c-a-b-1)} S_n^{(c-a-b-1)} = A_n^{(c-1)} F_1(-n; a, b; c; x, y).$$


1 Bateman, H., Proc. Nat. Acad. Sci., 26, 491-496 (1940).
2 Bateman, H., Tôhoku Math. Jour., 37, 23-38 (1933).
3 Perron, O., Heidelberg Akad. Sitzungsber. (1916), 9 Abh., 24 pp.; (1917), 1 Abh., 69 pp.
4 Stieltjes, T. J., Quart. Jour. Math., 24, 370-382 (1890); Oeuvres, 2, 378-391.
5 Appell, P., and Kampé de Fériet, J., Fonctions hypergéométriques, Paris, 1926.


AN ORTHOGONAL PROPERTY OF THE HYPERGEOMETRIC 
POLYNOMIAL 


By H. BATEMAN

1. Mittag-Leffler's polynomial $g_n(z)$ has the orthogonal property

$$\int_{-\infty}^{\infty} g_m(-ix)\, g_n(ix)\, \frac{dx}{x \operatorname{sh}(\pi x)} = 0, \quad m \neq n,\ m > 0; \qquad = 2/n, \quad m = n,\ n > 0. \tag{1.1}$$

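This is readily obtained by inverting the integral representation¹

$$g_n(ix) = (i/\pi) \operatorname{sh}(\pi x) \int_{-\infty}^{\infty} e^{ixu} (\tanh \tfrac12 u)^n\, du/\operatorname{sh} u, \qquad n \geq 1, \tag{1.2}$$

by means of Fourier's inversion formula. The resulting equation

$$\operatorname{cosech} u\, (\tanh \tfrac12 u)^n = \frac{1}{2i} \int_{-\infty}^{\infty} e^{-ixu} \operatorname{cosech}(\pi x)\, g_n(ix)\, dx \tag{1.3}$$

then gives the desired relation when $\operatorname{sh} u\, e^{-ixu}$ is expanded in powers of
$\tanh \tfrac12 u$. With the notation of the hypergeometric function the orthogonal
relation may be written in the form

$$\int_{-\infty}^{\infty} F(1 - m, 1 + ix; 2; 2)\, F(1 - n, 1 - ix; 2; 2)\, \frac{x\, dx}{\operatorname{sh}(\pi x)} = 0, \quad m \neq n; \qquad = 1/(2n), \quad m = n > 0. \tag{1.4}$$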


2. A more general relation may be obtained by writing Euler's integral
in the form

$$F(-n, ix; c; z)\, B(ix, c - ix) = \int_{-\infty}^{\infty} e^{ixu - \frac12 cu} \left(1 - \tfrac12 z - \tfrac12 z \tanh \tfrac12 u\right)^n \frac{du}{(2 \operatorname{ch} \tfrac12 u)^c} \tag{2.1}$$

and treating it in much the same way as the integral used to represent
$g_n(ix)$. The result is that if $f_n(x) = F(-n, ix; c; z)$ where $z$ is real

$$\int_{-\infty}^{\infty} B(ix, c - ix)(z - 1)^{ix} f_m(x) f_n(x)\, dx = 0, \quad m \neq n; \qquad = 2\pi(-)^n n!\,(z - 1)^{c+n}/[(c, n)\, z^c], \quad m = n \tag{2.2}$$

where $B(p, q)$ is the Beta function. This includes the relation

$$\tfrac12 \int_{-\infty}^{\infty} \operatorname{sech}(\tfrac12 \pi x)\, w_m(x)\, w_n(x)\, dx = 0, \quad m \neq n; \qquad = 1, \quad m = n \tag{2.3}$$

satisfied by the polynomial $w_n(x)$ defined by means of the generating function

$$(1 + t^2)^{-\frac12} \exp(-x \arctan t) = \sum_{n=0}^{\infty} w_n(x)\, t^n. \tag{2.4}$$
A polynomial with an orthogonal property for the weight function
$\operatorname{sech}(\pi x)$ was discovered in 1940 by G. H. Hardy.² My attention was called
to the polynomial $w_n(x)$ by a letter from B. R. Wicker of Loyola University,
Los Angeles. He defined the polynomial in the first place by means of a
definite integral equivalent to

$$i^n w_n(x) = (1/\pi) \operatorname{ch}(\tfrac12 \pi x) \int_{-\infty}^{\infty} e^{-ixu} \operatorname{sech} u\, \tanh^n u\, du \tag{2.5}$$

and by a contour integral. From these he obtained the generating function
and a recurrence relation

$$(n + 1)\, w_{n+1}(x) + n\, w_{n-1}(x) + x\, w_n(x) = 0. \tag{2.6}$$

He also noted the existence of an orthogonal relation but did not see the
relation between his polynomial and that of Hardy. Originally Hardy
used the notation $g_n(x)$ and Wicker the notation $Q_n(x)$ but as both of these
notations are used for Legendre functions of the second kind the notation
$w_n(x)$ is preferable.
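
The orthogonal relation (2.3) lends itself to a quick numerical check. The sketch below builds $w_n(x)$ from the recurrence (2.6), starting from $w_0 = 1$ and $w_1 = -x$ as given by the first two coefficients of the generating function (2.4), and integrates against $\tfrac12 \operatorname{sech}(\tfrac12 \pi x)$ by Simpson's rule. The truncated range, the step count and the function names are assumptions made for illustration.

```python
import math

def w_polynomials(x, n_max):
    """Values w_0(x), ..., w_{n_max}(x) computed from the recurrence (2.6)."""
    values = [1.0, -x]  # w_0 = 1 and w_1 = -x follow from (2.4)
    for n in range(1, n_max):
        values.append(-(x * values[n] + n * values[n - 1]) / (n + 1))
    return values[: n_max + 1]

def inner_product(m, n, half_width=80.0, steps=16000):
    """Simpson estimate of (1/2) * integral of sech(pi x / 2) w_m(x) w_n(x) dx."""
    def integrand(x):
        w = w_polynomials(x, max(m, n))
        return 0.5 / math.cosh(0.5 * math.pi * x) * w[m] * w[n]
    h = 2.0 * half_width / steps
    total = integrand(-half_width) + integrand(half_width)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(-half_width + i * h)
    return total * h / 3.0

if __name__ == "__main__":
    for m in range(4):
        print([round(inner_product(m, n), 6) for n in range(4)])
```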

3. The polynomial $w_n(x)$ may be expressed in terms of my polynomial³
$F_n(z)$ by means of the equation

$$i^n w_n(x) = \sum_{m=0}^{n} \gamma_{n,m} F_m(ix)$$

where

$$t^n = \sum_{m=0}^{n} \gamma_{n,m} P_m(t). \tag{3.1}$$

If $F_n^m(z)$ denotes Pasternack's polynomial⁴ which is such that when
$R(m) > -1$ and $|R(z)| < 1 + R(m)$

$$2^m B(\tfrac12 m + \tfrac12 + \tfrac12 iz, \tfrac12 m + \tfrac12 - \tfrac12 iz)\, F_n^m(iz) = \int_{-\infty}^{\infty} e^{-ixz} P_n(\tanh x)\, \frac{dx}{(\operatorname{ch} x)^{m+1}} \tag{3.2}$$

we may deduce the orthogonal relation


oo 


1/ 2 pes 12)dz a ; 
' ch(rz) + cos (mr) , annie 


-_- @ 





= 1/(2n + 1) an’ =n. (3.3) 


When $m = 0$ we have $F_n^0(z) = F_n(z)$ and the foregoing relation gives the
known orthogonal property of $F_n(z)$. When $m = 1$ the function $F_n^1(z)$
reduces to the function $E_n(z)$ for which an orthogonal relation has not yet
been found. This polynomial $E_n(z)$ was defined by the operational equation

$$E_n(d/du) \operatorname{sech}^2 u = \operatorname{sech}^2 u\, P_n(\tanh u). \tag{3.4}$$

When $m = -\tfrac12$ the orthogonal relation satisfied by $F_n^m(iz)$ is of the
same type as that satisfied by $w_n(z)$ and, indeed, we have the relation

$$F_n^{-\frac12}(\tfrac12 ix) = i^n w_n(x). \tag{3.5}$$


4. In the case of Rice's polynomial⁵ $H_n(x, p, v)$ which is represented by
the integral





Hyls, p, Ble, p — 2) = Set + e- P,| 1 - la (4.1) 


the orthogonal relation seems to be 


S Halis, p, v)B(iz, p — iz)Am(— iz, p, v)dz = 0 mAsznN 
—-o T m=—n (4.2) 


where $A_n(x, p, v)$ is defined by means of the expansion


1—4 ¢ 2v Rn se | 
= 2 nv) Ps . 4.3 
(- + 20 — -) (. 4+ 2 — -) LA (x, p, v)P,(u) (4.3) 





It is readily found that 
A,(x, p,v) = (2m + 1) Ee F(x + p,x+1; «+2; v-') — 
x+1 
{n(m + 1)/1!1!} os F(x + p,*% +2; «+ 3; v)+ 
x+2 


{(n — 1)n(n + 1)(n + 2)/2128} <5 Fle + 9,2 +3; x +4; 
by 


ae Le ‘ (4.4) 
5. Hardy⁶ has shown that the functions

$$W_n(x) = \frac{\sin \pi(x - n)}{\pi(x - n)}$$

form an orthogonal system for the range $-\infty < x < \infty$, and it is natural
to ask whether the functions

$$b_n(x) = \operatorname{sech} x\, F_n\!\left(\frac{2ix}{\pi}\right) i^{-n}$$

can be expressed as linear combinations of the functions $W_m(x)$ by means of
the formula of interpolation

$$b_n(x) = \sum_{m=-\infty}^{\infty} W_m(x)\, b_n(m). \tag{5.1}$$

This formula has been tested numerically for the case $n = 0$, $x = \tfrac12$
when it becomes

$$\pi \operatorname{sech} x = \sin \pi x \left[\frac{1}{x} - \frac{2x}{x^2 - 1} \operatorname{sech} 1 + \frac{2x}{x^2 - 4} \operatorname{sech} 2 - \frac{2x}{x^2 - 9} \operatorname{sech} 3 + \frac{2x}{x^2 - 16} \operatorname{sech} 4 - \ldots\right] \tag{5.2}$$

Using the values

sech 1 = .64805 42734 04663 6
sech 2 = .27111 82739 42365 7
sech 3 = .09132 79274 19433 4
sech 4 = .03661 89934 73686 5
sech 5 = .01334 05293 99091 5

ch ½ = 1.12762 59652 06381, π = 3.14159 26535 89793

derived from numbers in the British Association Tables, vol. 1, we find that
π sech ½ = 2.786 . . . while the right-hand side exceeds 2.791774 when
only the first three terms are taken into consideration. The complete
value is greater than this and so this numerical test indicates that the
supposition (5.1) is false.


1 Bateman, H., Proc. Nat. Acad. Sci., 26, 491-496 (1940).
2 Hardy, G. H., Proc. Cambridge Phil. Soc., 36, 1-8 (1940).
3 Bateman, H., Tôhoku Math. Jour., 37, 23-38 (1933); Annals of Math., 35, 767-775 (1934).
4 Pasternack, S., Phil. Mag., (7) 28, 209-226 (1939).
5 Rice, S., Duke Math. Jour., 6, 108-119 (1940).
6 Hardy, G. H., Proc. Cambridge Phil. Soc., 37, 331-348 (1941).

CONTINGENCY TABLES
By EDWIN B. WILSON AND JANE WORCESTER
HARVARD SCHOOL OF PUBLIC HEALTH 


Communicated June 27, 1942 


If there be given a fourfold universe such as exists in the case of a population
having two characters A and B, with four probabilities $p_1 = p_{AB}$,
$p_2 = p_{\alpha B}$, $p_3 = p_{A\beta}$, $p_4 = p_{\alpha\beta}$ whose sum is 1 for
the presence or absence of the characters in pairs, and if a sample of N be drawn
from it and tabulated in a fourfold table with $n_1 = (AB)$, $n_2 = (\alpha B)$,
$n_3 = (A\beta)$, $n_4 = (\alpha\beta)$ in the notation of Yule¹ with
$\sum n = N$, the chance of any particular table is

$$P = \frac{N!}{n_1!\, n_2!\, n_3!\, n_4!}\, p_1^{n_1} p_2^{n_2} p_3^{n_3} p_4^{n_4}. \tag{1}$$

By virtue of the definitions of probability, the probabilities of A and of B
are, respectively,

$$p_A = p_1 + p_3, \qquad p_B = p_1 + p_2. \tag{2}$$

There are many such tables, in fact $(N + 1)(N + 2)(N + 3)/6$. If we observe
a specific table in which

$$(A) = n_1 + n_3, \qquad (B) = n_1 + n_2, \qquad N = n_1 + n_2 + n_3 + n_4 \tag{3}$$

and if we limit our consideration of the $(N + 1)(N + 2)(N + 3)/6$ tables to
only that small series of tables for which (3) holds between $n_1, n_2, n_3, n_4$,
then only one of the four numbers $n_i$ is free to vary and the number of
tables in the series is only one more than the least of the quantities
$(A)$, $(B)$, $(\alpha)$, $(\beta)$.
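
Both counts can be confirmed by enumeration for a small sample. The sketch below lists every fourfold table with a given total and then keeps those satisfying (3); the function names and the numerical example are illustrative.

```python
def all_tables(N):
    """Every fourfold table (n1, n2, n3, n4) of non-negative counts summing to N."""
    return [(n1, n2, n3, N - n1 - n2 - n3)
            for n1 in range(N + 1)
            for n2 in range(N + 1 - n1)
            for n3 in range(N + 1 - n1 - n2)]

def tables_with_margins(N, A, B):
    """Tables satisfying (3): n1 + n3 = (A) and n1 + n2 = (B)."""
    return [t for t in all_tables(N)
            if t[0] + t[2] == A and t[0] + t[1] == B]

if __name__ == "__main__":
    N, A, B = 10, 4, 7
    print(len(all_tables(N)), (N + 1) * (N + 2) * (N + 3) // 6)
    print(len(tables_with_margins(N, A, B)), 1 + min(A, B, N - A, N - B))
```

With N = 10, (A) = 4 and (B) = 7 there are 286 tables in all, of which only 4 have the observed marginal totals.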

Leaving $n_1$ undetermined and eliminating $n_2, n_3, n_4$ from (1) by (3) we
may write

$$P = \frac{N!\, p_1^{n_1} p_2^{(B) - n_1} p_3^{(A) - n_1} p_4^{N - (A) - (B) + n_1}}{n_1!\, n_2!\, n_3!\, n_4!}. \tag{4}$$

The probabilities $p_1, p_2, p_3, p_4$ in the universe are unknown save that their
sum must be 1. The value of P may, however, be written as

$$P = \frac{N!\, p_2^{(B)} p_3^{(A)} p_4^{N - (A) - (B)}}{n_1!\, n_2!\, n_3!\, n_4!} \left(\frac{p_1 p_4}{p_2 p_3}\right)^{n_1}. \tag{5}$$

As the first fraction of the product contains in the exponents of $p_2, p_3, p_4$
only quantities which are taken as given by (3) for all the tables which
are being considered, viz., all with fixed marginal totals, it will not vary
from table to table in this series by virtue of the values $p_2, p_3, p_4$, no
matter what those unknown values be, but only by virtue of the changes in the
value of $n_1$ and the correlative changes of $n_2, n_3, n_4$ which occur as
factorials in the denominator of this fraction. The second factor, however,
contains $n_1$ which changes and will be invariant for the series of tables if
and only if

$$p_1 p_4 = p_2 p_3. \tag{6}$$

In view of the identical conditions (2) taken with $\sum p = 1$, we may eliminate
$p_2, p_3, p_4$ from (6) leaving an equation in $p_1$ alone, viz.,

$$p_1(1 + p_1 - p_A - p_B) = (p_B - p_1)(p_A - p_1) \tag{7}$$

or

$$p_1 = p_A p_B. \tag{8}$$

Thus if we specify that (5) be invariant in so far as it depends upon $p_i$
from term to term in the series of tables for which marginal totals are given,
we arrive at the condition that $p_{AB} = p_A p_B$, namely, that A and B be
independent characters in the universe.

If we had started with the assumption that $p_1 = p_A p_B$ then

$$P = \frac{N!\, p_A^{(A)} p_B^{(B)} p_\alpha^{(\alpha)} p_\beta^{(\beta)}}{n_1!\, n_2!\, n_3!\, n_4!} \tag{9}$$

and the value of P within the series would depend in a constant manner
on $p_A$ and $p_B$. That which the proof given in formulas (1) to (8) shows
is that conversely the relation $p_1 = p_A p_B$ follows from the general
assumption that P must not vary from term to term in the series because of the
unknown probabilities but only because of the variation in $n_1$.

If our concern with respect to inference from the observations to the
conclusion that they do or do not come from an associated universe may be
limited to a calculation of relative frequencies in this series, we find that
those relative frequencies are inversely as the product of the factorials of
the numbers in the table; if we are willing to compute the products of the
reciprocals of the factorials for all terms of the series, the relative
probabilities may be found, but if the products of the reciprocals of the
factorials may be summed algebraically, then the formula

$$\left[\sum \frac{1}{n_1!\, n_2!\, n_3!\, n_4!}\right]^{-1} \frac{1}{n_1!\, n_2!\, n_3!\, n_4!} \tag{10}$$

may be used to compute the relative probabilities of the limited number of
terms in the tail or tails of the series necessary to determine the significance
or non-significance of the observations in the matter of association of A and
B in their universe.
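
Formula (10) can be applied mechanically once the marginal totals are fixed. The sketch below computes the relative probability of every table in the series in exact rational arithmetic and sums a tail; the function names and the numerical example are illustrative. The algebraic value $N!/[(A)!\,(B)!\,(\alpha)!\,(\beta)!]$ used for comparison is the standard closed form of the sum in (10); it is not quoted in the text above.

```python
from fractions import Fraction
from math import factorial

def series_weights(N, A, B):
    """Reciprocal factorial products 1/(n1! n2! n3! n4!), keyed by n1."""
    weights = {}
    for n1 in range(max(0, A + B - N), min(A, B) + 1):
        n2, n3, n4 = B - n1, A - n1, N - A - B + n1
        weights[n1] = Fraction(1, factorial(n1) * factorial(n2)
                               * factorial(n3) * factorial(n4))
    return weights

def series_probabilities(N, A, B):
    """Relative probabilities (10) of the tables with margins (A), (B) fixed."""
    weights = series_weights(N, A, B)
    total = sum(weights.values())
    return {n1: w / total for n1, w in weights.items()}

def upper_tail(N, A, B, observed_n1):
    """Probability, within the series, of a table with n1 >= observed_n1."""
    return sum(p for n1, p in series_probabilities(N, A, B).items()
               if n1 >= observed_n1)

if __name__ == "__main__":
    for n1, p in series_probabilities(10, 4, 7).items():
        print(n1, p)
    print(upper_tail(10, 4, 7, 4))
```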
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If we had started with samples of m and N — n in two different point- 
binomial universes with probabilities p; = p, pp = 1 — p, ps = p’, fs = 
1 — ’ in the two pairs of cells coming from the two universes indepen- 
dently,” the probability of the observations would be 


n 1p, bp (N sh n) ! ps” pa” 





P= (1’) 


The definitions of probability make 


bi + po = 1, Ps + fy = 1, (2’a) 


and furthermore if we add conditions (3) fixing the marginal totals m, + m3 
and 2 + m4, of which the latter is not independent of the former, 


(A) = m + m,, (3’) 


and finally the probability of an A in sampling from the two independent 
universes must be 


N - 
P, = ve + hs (2’b) 


which has to be adjoined to (2’a) so that both (2’a) and (2’b) taken together 


correspond to (2). From these equations we may get 


$$P = \frac{n!\,(N-n)!\,p_1^{\,n_1} p_2^{\,n-n_1} p_3^{\,(A)-n_1} p_4^{\,N-(A)-n+n_1}}{n_1!\,n_2!\,n_3!\,n_4!} \tag{4'}$$


from which follows 


$$P = \frac{n!\,(N-n)!\,p_2^{\,n} p_3^{\,(A)} p_4^{\,N-(A)-n}}{n_1!\,n_2!\,n_3!\,n_4!} \left(\frac{p_1 p_4}{p_2 p_3}\right)^{n_1} \tag{5'}$$


Introducing now the assumption that (5') may not vary from term to term
in the short series selected from the (n + 1)(N − n + 1) possible samples
by the restriction that the marginal totals (A) and N − (A) shall be as
observed, we find again

$$p_1 p_4 = p_2 p_3 \tag{6'}$$

which by virtue of (2'a) becomes

$$p_1(1 - p_3) = (1 - p_1)\,p_3 \quad\text{or}\quad p_1 = p_3 \tag{8'}$$

and shows that the proportions in the two point-binomial universes must
be the same; it has not been necessary to use (2'b) but this relation fixes
$p_1$ and $p_3$ as equal to $p_A$. The relative probabilities within the series again
reduce therefore to (10) in which the summation factor must be the same
because the summation must be over the same values of $n_1$ as before.³
The general principle seems to be capable of statement as:


If there be given ν cellular universes, $U_1, U_2, \ldots, U_\nu$, with unknown
probabilities $p_{11}, \ldots, p_{1k_1};\ p_{21}, \ldots, p_{2k_2};\ \ldots;\ p_{\nu 1}, \ldots, p_{\nu k_\nu}$, to the total number
of T in their cells; and if samples of $N_1, N_2, \ldots, N_\nu$ be drawn from them
with numbers $n_{11}, \ldots, n_{1k_1};\ n_{21}, \ldots, n_{2k_2};\ \ldots;\ n_{\nu 1}, \ldots, n_{\nu k_\nu}$; and if certain
independent totals to the number of L among the n's be assigned including
the ν totals $N_i$, the condition that the probability of the samples drawn
depends in the same way upon the unknown probabilities throughout the
subseries of tables for which these L totals remain fixed, and which therefore
has T − L degrees of freedom, will enable the probability of the samples
within the series to be written in terms of L unknown probabilities and the
L assigned totals and will give T − L relations between the unknown
probabilities. The relative probabilities within the subseries of T − L degrees of
freedom are then proportional to the reciprocals of products of the factorials
of the numbers occurring in the cells of the tables and may be determined
by computing these values for all tables in the series whether or not an
algebraic formula for the sum of those reciprocals can be obtained. The
T unknown probabilities may be expressed in terms of L − ν unknown
independent variables, thus determining the type of universe which represents
the appropriate null hypothesis. If the L − ν unknown variables be
assigned values derived from the L relations among the n's, the unknown
probabilities and "expected values" in the samples may be estimated, from
which the value of χ² for the observed table and an approximate value of
the probability of a table departing from the table of expected values as
much or more than the one observed may be had.

Whether so general a principle is of much utility is difficult to say. It 
will allow problems to be solved when the probability set-up is not at first 
clear. For example: Suppose there be given a universe of association 
between two characters and that a sample of N is drawn subject to the 
condition that (A) + (B) = M shall be fixed, i.e., the total number of 
individuals with character A or B or both shall be constant; then 


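$$P = \frac{N!\,p_1^{n_1} p_2^{n_2} p_3^{n_3} p_4^{n_4}}{n_1!\,n_2!\,n_3!\,n_4!} = \frac{N!\,p_1^{n_1} p_2^{n_2} p_3^{\,M-2n_1-n_2} p_4^{\,N-M+n_1}}{n_1!\,n_2!\,n_3!\,n_4!} = \frac{N!\,p_3^{\,M} p_4^{\,N-M}}{n_1!\,n_2!\,n_3!\,n_4!} \left(\frac{p_1 p_4}{p_3^{2}}\right)^{n_1} \left(\frac{p_2}{p_3}\right)^{n_2}.$$


Hence, by the principle,


$$p_1 p_4 = p_3^2, \quad p_2 = p_3 \quad\text{or}\quad p_1 p_4 = p_2^2 = p_3^2,$$
$$p_1 + p_2 + p_3 + p_4 = 1, \qquad 2p_1 + p_2 + p_3 = p_M.$$

Note that $p_M$, which is M/N, need not be a probability at all if for no other
reason than that it may exceed 1. Finally,


$$p_1 = \tfrac{1}{4}\,p_M^2, \qquad p_2 = p_3 = \tfrac{1}{2}\,p_M\left(1 - \tfrac{1}{2}\,p_M\right), \qquad p_4 = \left(1 - \tfrac{1}{2}\,p_M\right)^2.$$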
Thus if we had given a table with N = 8 and M = 12 as


          A    α                     A    α                       A     α
    B     6    0    6          B     4    4    8           B     9/2   3/2
    β     0    2    2    or    β     0    0    0    with   β     3/2   1/2
          6    2    8                4    4    8


as "expected" from $n_i = N p_i$, there would be nine possible tables in the
series for which N = 8 and M = 12, viz., the two written and these seven








    4  3      5  2      4  2      5  1      5  0      4  1      4  0
    1  0      0  1      2  0      1  1      2  1      3  0      4  0








The probabilities for these 9 in the order written are as

    2/2880   5/2880   20/2880   12/2880   30/2880   24/2880   12/2880   20/2880   5/2880

or

    2/130    5/130    20/130    12/130    30/130    24/130    12/130    20/130    5/130

whose sum is 1. Now the table written first has P = 0.015 whereas for it
χ² = 8.00 which gives P = 0.019; and the table written second has accumulated
probability for itself and all tables no more probable of P = 0.092
whereas for it χ² = 6.22 corresponding to P = 0.046.
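The arithmetic of this example can be retraced by direct enumeration. The sketch below is only an illustration under the stated conditions ($\sum n_i = N$ and $2n_1 + n_2 + n_3 = M$); the variable and function names are hypothetical, and the cell order (n1, n2, n3, n4) follows the tables as written above.

```python
from fractions import Fraction
from math import factorial, prod

N, M = 8, 12

# All tables (n1, n2, n3, n4) with total N and 2*n1 + n2 + n3 = M.
tables = [
    (n1, n2, n3, N - n1 - n2 - n3)
    for n1 in range(N + 1)
    for n2 in range(N + 1 - n1)
    for n3 in range(N + 1 - n1 - n2)
    if 2 * n1 + n2 + n3 == M
]

# Relative probabilities: reciprocals of factorial products, normalized.
weights = {t: Fraction(1, prod(factorial(n) for n in t)) for t in tables}
total = sum(weights.values())
rel = {t: w / total for t, w in weights.items()}

# Expected table from p1 = pM^2/4, p2 = p3 = (pM/2)(1 - pM/2), p4 = (1 - pM/2)^2.
p_M = Fraction(M, N)
half = p_M / 2
expected = [N * p for p in (half**2, half * (1 - half), half * (1 - half), (1 - half) ** 2)]

def chi_square(table):
    """Chi-square of a table against the expected table."""
    return sum((n - e) ** 2 / e for n, e in zip(table, expected))

def accumulated(table):
    """Sum of the relative probabilities of the table and all tables no more probable."""
    return sum(r for r in rel.values() if r <= rel[table])

for t in [(6, 0, 0, 2), (4, 4, 0, 0)]:
    print(t, float(accumulated(t)), float(chi_square(t)))
```

The enumeration is brute force on purpose: with N = 8 there are few candidates, and no algebraic summation is needed.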

For this problem it might be difficult to set up a random sampling procedure
and it might be difficult to perform algebraically the sum of
$[n_1!\,n_2!\,n_3!\,n_4!]^{-1}$ subject to the conditions $\sum n_i = N$ and $2n_1 + n_2 + n_3 = M$
or $n_4 - n_1 = (\alpha\beta) - (AB) = N - M$, but the principle gives a solution
for the relative probabilities and for the expected table and if one is willing
to accept that solution and the further rule that the significance of a table
in the spread of two degrees of freedom is to be determined by the sum of
the probabilities of the table and all tables no more probable, the two rules
taken together will give a test of significance. Whether it is an asset or a
liability to have rules which will do such things under such general conditions
only time can tell. No rules for inference which have thus far been
proposed, such as the rule of equal distribution of ignorance, the rule of
inverse probabilities, the rule based on confidence intervals or regions,
and possibly even the Yates-Fisher rule, have stood a long test of time and
remained acceptable.


1 An Introduction to the Theory of Statistics, by G. U. Yule (or new edition by G. U. 
Yule and M. G. Kendall), Chaps. I-V, London, Charles Griffin and Co. 

2 Yates, F., Suppl. Jour. Roy. Statist. Soc. London, 1, 217-235 (1934). The proof we
give is the converse of that given by R. A. Fisher, Statistical Methods for Research
Workers, Art. 21.02, in which he assumes p = p'.

3 In a recent note (these PROCEEDINGS, 28, 94-100 (1942) footnote 7) it was pointed
out in discussing a 2 × 3 table that if the observations

    n_1    n_2    n_3    |   n_1 + n_2 + n_3 = n
    n_4    n_5    n_6    |   n_4 + n_5 + n_6 = N − n

    n_1 + n_4 = (A)     n_2 + n_5 = (B)     n_3 + n_6 = N − (A) − (B)

were made, the values of the relative probabilities involved would not depend in
final analysis, according to the general principle adopted, on whether the table had arisen
(1) from a true sixfold universe with probabilities $\pi_1, \pi_2, \pi_3, \pi_4, \pi_5, \pi_6$, with $\sum\pi = 1$, from which

$$\frac{(N+5)!}{5!\,N!}$$

tables of N elements could be drawn, or (2) from a pair of trinomial universes
from which n and N − n elements, respectively, were drawn with the six probabilities
connected by $\pi_1 + \pi_2 + \pi_3 = 1$, $\pi_4 + \pi_5 + \pi_6 = 1$, of which

$$\frac{(n+2)!\,(N-n+2)!}{(2!)^2\,n!\,(N-n)!}$$

could be found, or (3) from a set of three point-binomial universes from which (A), (B),
N − (A) − (B) elements, respectively, were drawn with the six probabilities connected
by $\pi_1 + \pi_4 = 1$, $\pi_2 + \pi_5 = 1$, $\pi_3 + \pi_6 = 1$, of which

$$\frac{[(A)+1]!\,[(B)+1]!\,[N-(A)-(B)+1]!}{(A)!\,(B)!\,[N-(A)-(B)]!}$$

could be found.


It is instructive to work out the second case in detail. We have 


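$$P = \frac{n!\,(N-n)!\,\pi_1^{n_1}\pi_2^{n_2}\pi_3^{n_3}\pi_4^{n_4}\pi_5^{n_5}\pi_6^{n_6}}{n_1!\,n_2!\,n_3!\,n_4!\,n_5!\,n_6!} \tag{1''}$$
$$\pi_1 + \pi_2 + \pi_3 = 1, \qquad \pi_4 + \pi_5 + \pi_6 = 1, \tag{2''a}$$
$$\pi_A = \frac{n}{N}\,\pi_1 + \frac{N-n}{N}\,\pi_4, \qquad \pi_B = \frac{n}{N}\,\pi_2 + \frac{N-n}{N}\,\pi_5, \tag{2''b}$$
$$(A) = n_1 + n_4, \qquad (B) = n_2 + n_5. \tag{3''}$$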


By virtue of (3”) 


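$$P = \frac{n!\,(N-n)!\,\pi_1^{n_1}\pi_2^{n_2}\pi_3^{\,n-n_1-n_2}\pi_4^{\,(A)-n_1}\pi_5^{\,(B)-n_2}\pi_6^{\,N-(A)-(B)-n+n_1+n_2}}{n_1!\,n_2!\,n_3!\,n_4!\,n_5!\,n_6!} \tag{4''}$$
$$\phantom{P} = \frac{n!\,(N-n)!\,\pi_3^{\,n}\pi_4^{\,(A)}\pi_5^{\,(B)}\pi_6^{\,N-(A)-(B)-n}}{n_1!\,n_2!\,n_3!\,n_4!\,n_5!\,n_6!} \left(\frac{\pi_1\pi_6}{\pi_3\pi_4}\right)^{n_1} \left(\frac{\pi_2\pi_6}{\pi_3\pi_5}\right)^{n_2} \tag{5''}$$

Hence

$$\pi_1\pi_6 = \pi_3\pi_4, \qquad \pi_2\pi_6 = \pi_3\pi_5 \tag{6''}$$

by virtue of the principle that for variation within the series of two degrees of freedom
depending on $n_1$ and $n_2$, P must not change no matter what be the unknown probabilities
in the universe or universes. Using next the relations (2''a) one finds

$$\pi_1(1 - \pi_4 - \pi_5) = (1 - \pi_1 - \pi_2)\,\pi_4, \qquad \pi_2(1 - \pi_4 - \pi_5) = (1 - \pi_1 - \pi_2)\,\pi_5,$$
$$\pi_1(1 - \pi_5) = (1 - \pi_2)\,\pi_4, \qquad \pi_2(1 - \pi_4) = (1 - \pi_1)\,\pi_5.$$

Hence
$\pi_1 = \pi_4$, $\pi_2 = \pi_5$, and $\pi_3 = \pi_6$,
and from this by (2''b) the probabilities $\pi_1$, $\pi_2$ may be shown to be $\pi_A$, $\pi_B$. Then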


$$P = \frac{n!\,(N-n)!\,(1 - \pi_A - \pi_B)^{n}\,\pi_A^{(A)}\,\pi_B^{(B)}\,(1 - \pi_A - \pi_B)^{N-(A)-(B)-n}}{n_1!\,n_2!\,n_3!\,n_4!\,n_5!\,n_6!}$$

which is precisely the expression that would have been written down in the first place
if it had been assumed at the start that $\pi_1 = \pi_4 = \pi_A$ and $\pi_2 = \pi_5 = \pi_B$.

The expression may be summed by writing down the probability of dividing N into
(A), (B), N − (A) − (B), viz.,

$$\frac{N!\,\pi_A^{(A)}\,\pi_B^{(B)}\,(1 - \pi_A - \pi_B)^{N-(A)-(B)}}{(A)!\,(B)!\,[N-(A)-(B)]!}$$

Thus the relative probability within the series of two degrees of freedom is the quotient

$$R = \frac{(A)!\,(B)!\,[N-(A)-(B)]!\;n!\,(N-n)!}{N!\;n_1!\,n_2!\,n_3!\,n_4!\,n_5!\,n_6!}$$

and would in fact be the same on either of the other two suppositions as to the origin of
the observed numbers $n_i$.


THE ASSOCIATION OF THREE ATTRIBUTES 
By EDWIN B. WILSON AND JANE WORCESTER
HARVARD SCHOOL OF PUBLIC HEALTH


Communicated August 14, 1942 


If from a universe of individuals with or without three characters A, B, C
there be drawn a sample of N, the six numbers (A), (B), (C), (α), (β), (γ),
with or without each character will be known, as will the twelve numbers
(AB), ..., (βγ) of those with or without each pair of characters, and the
eight numbers (ABC), ..., (αβγ) which specify the primary populations
in the sample. Corresponding to these eight types of individuals there
will be eight (presumably unknown) probabilities in the universe, $p_{ABC}$, etc.
The chance of drawing the particular sample is


$$P = \frac{N!\;p_{ABC}^{(ABC)} \cdots p_{\alpha\beta\gamma}^{(\alpha\beta\gamma)}}{(ABC)!\cdots(\alpha\beta\gamma)!} \tag{1}$$


When discussing the statistical significance of the sample, which may be
considered as a 2 × 2 × 2 table, certain values taken from the sample
must be "used." There are four cases which we propose to discuss:

1. N, (A), (B), (C): the grand total and the "edge" subtotals.

2. (AB), (αB), (Aβ), (αβ), (C), (γ): one marginal "face" and the
complementary edge.

3. Two marginal faces: those of A, B and A, C.

4. All three marginal faces.

In the discussion it will be assumed that the sample probability (1) must
be independent of the values of the unknown cell probabilities for all
variations of the sample consistent with the marginal totals used.¹

1. The Edge Subtotals and the Grand Total Used.—There are four cell 
frequencies which may be assigned values consistent with these totals; 
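they may be taken as² (ABC), (BC), (AC), (AB). Then (1) becomes


$$\frac{N!\;p_{ABC}^{ABC}\,p_{\alpha BC}^{BC-ABC}\,p_{A\beta C}^{AC-ABC}\,p_{AB\gamma}^{AB-ABC}\,p_{A\beta\gamma}^{A-AB-AC+ABC}\,p_{\alpha B\gamma}^{B-AB-BC+ABC}\,p_{\alpha\beta C}^{C-AC-BC+ABC}\,p_{\alpha\beta\gamma}^{N-A-B-C+AB+AC+BC-ABC}}{ABC!\;\alpha BC!\;A\beta C!\;AB\gamma!\;A\beta\gamma!\;\alpha B\gamma!\;\alpha\beta C!\;\alpha\beta\gamma!}$$


where the parentheses which Yule uses to designate numbers have been
dropped, and will be reintroduced only when clarity requires it. The
expression in p may be written as


$$\left(\frac{p_{ABC}\,p_{A\beta\gamma}\,p_{\alpha B\gamma}\,p_{\alpha\beta C}}{p_{\alpha\beta\gamma}\,p_{\alpha BC}\,p_{A\beta C}\,p_{AB\gamma}}\right)^{ABC} \left(\frac{p_{AB\gamma}\,p_{\alpha\beta\gamma}}{p_{A\beta\gamma}\,p_{\alpha B\gamma}}\right)^{AB} \left(\frac{p_{A\beta C}\,p_{\alpha\beta\gamma}}{p_{A\beta\gamma}\,p_{\alpha\beta C}}\right)^{AC} \times$$
$$\left(\frac{p_{\alpha BC}\,p_{\alpha\beta\gamma}}{p_{\alpha B\gamma}\,p_{\alpha\beta C}}\right)^{BC} \left(\frac{p_{A\beta\gamma}}{p_{\alpha\beta\gamma}}\right)^{A} \left(\frac{p_{\alpha B\gamma}}{p_{\alpha\beta\gamma}}\right)^{B} \left(\frac{p_{\alpha\beta C}}{p_{\alpha\beta\gamma}}\right)^{C} p_{\alpha\beta\gamma}^{\,N},$$


and if the probability is not to vary with ABC, AB, AC, BC one has


$$\frac{p_{AB\gamma}\,p_{\alpha\beta\gamma}}{p_{A\beta\gamma}\,p_{\alpha B\gamma}} = \frac{p_{A\beta C}\,p_{\alpha\beta\gamma}}{p_{A\beta\gamma}\,p_{\alpha\beta C}} = \frac{p_{\alpha BC}\,p_{\alpha\beta\gamma}}{p_{\alpha B\gamma}\,p_{\alpha\beta C}} = \frac{p_{ABC}\,p_{\alpha\beta C}}{p_{A\beta C}\,p_{\alpha BC}} = 1.$$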


These conditions taken with $\sum p = 1$ express the interrelations of the eight
p's which are implied by the marginal totals used and the principle of
independence assumed. The most suggestive way to solve in terms of
three independent values of p seems to be to set


$$\frac{p_{A\beta\gamma}}{p_{\alpha\beta\gamma}} = \frac{p_1}{q_1}, \qquad \frac{p_{\alpha B\gamma}}{p_{\alpha\beta\gamma}} = \frac{p_2}{q_2}, \qquad \frac{p_{\alpha\beta C}}{p_{\alpha\beta\gamma}} = \frac{p_3}{q_3},$$

where $p_i + q_i = 1$. Then the first three conditions will give $p_{AB\gamma}/p_{\alpha\beta\gamma}$,
$p_{A\beta C}/p_{\alpha\beta\gamma}$, $p_{\alpha BC}/p_{\alpha\beta\gamma}$ and the fourth $p_{ABC}/p_{\alpha\beta\gamma}$, whereupon $\sum p = 1$
makes $p_{\alpha\beta\gamma} = q_1 q_2 q_3$ and each probability becomes the product of three
p's and q's.

The values of the cell probabilities remain unknown but expressed in
terms of a reduced number, namely three, as $p_{ABC} = p_1 p_2 p_3$, ..., $p_{\alpha\beta\gamma} = q_1 q_2 q_3$.
There are four degrees of freedom. One may compute the relative
probabilities of all the samples which may occur subject to the marginal
totals specified; this may be done by writing down all the tables and taking
their probabilities as inversely proportional to the product of the factorials
of the cell frequencies, or it may be done by noting that the probability of
the given totals A, B, C arising in three independent partitions of N is


$$\frac{N!\,p_1^{A} q_1^{\alpha}\;N!\,p_2^{B} q_2^{\beta}\;N!\,p_3^{C} q_3^{\gamma}}{A!\,\alpha!\,B!\,\beta!\,C!\,\gamma!} \tag{2}$$


and hence the relative probability of any table must be the quotient of
(1) by (2) which reduces to³

$$\frac{A!\,\alpha!\,B!\,\beta!\,C!\,\gamma!}{(N!)^2\;ABC!\cdots\alpha\beta\gamma!} \tag{3}$$


If finally it is assumed that those tables which with all equally or less
probable tables have a total (relative) probability no greater than 0.05 or
some other assigned value are significant at the level of that value, the
question of significance can be settled. From the form $p_{ABC} = p_1 p_2 p_3$, ...,
$p_{\alpha\beta\gamma} = q_1 q_2 q_3$ in which the cellular probabilities can be written
it is clear that the significance in question has to do with the complete
independence or not of the three characters in the universe.⁴ Had the
assumption of complete independence been made in the first place it would
have been seen that (1) was independent of $p_1$, $p_2$, $p_3$ and that (3) followed as
the value of the relative frequency.⁵
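Formula (3) can be checked against the small example given in footnotes 2 and 3 (N = 7, A = 1, B = 2, C = 3). The following is a minimal sketch written for this purpose, not part of the original paper; the function names are hypothetical, and lower-case a, b, g in the code stand for α, β, γ.

```python
from fractions import Fraction
from itertools import product
from math import factorial, prod

def tables_with_edge_totals(N, A, B, C):
    """Yield every 2 x 2 x 2 table with grand total N and edge totals A, B, C.

    Cell order: ABC, ABg, AbC, aBC, Abg, aBg, abC, abg.
    """
    for ABC, ABg, AbC, aBC in product(range(N + 1), repeat=4):
        Abg = A - ABC - ABg - AbC
        aBg = B - ABC - ABg - aBC
        abC = C - ABC - AbC - aBC
        abg = N - (ABC + ABg + AbC + aBC + Abg + aBg + abC)
        cells = (ABC, ABg, AbC, aBC, Abg, aBg, abC, abg)
        if min(cells) >= 0:
            yield cells

def relative_probability(cells, N, A, B, C):
    """Formula (3): A! a! B! b! C! g! / ((N!)^2 * product of cell factorials)."""
    num = prod(factorial(t) for t in (A, N - A, B, N - B, C, N - C))
    den = factorial(N) ** 2 * prod(factorial(c) for c in cells)
    return Fraction(num, den)

N, A, B, C = 7, 1, 2, 3
tables = list(tables_with_edge_totals(N, A, B, C))
rel = {t: relative_probability(t, N, A, B, C) for t in tables}
for t in sorted(rel):
    print(t, rel[t])
```

The search over four free cells is exhaustive rather than clever, which is adequate only for small N.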

2. One Marginal Face and the Complementary Edge Used.—Of the six 
numbers AB, αB, Aβ, αβ, C, γ only five are independent as the sum of the
first four is equal to that of the last two and is N. There are three degrees
of freedom in the sense that in terms of the values⁶ of ABC, AC, BC, one
has ABγ = AB − ABC and


αBC = BC − ABC, AβC = AC − ABC, αβC = C − AC − BC + ABC,
αBγ = αB − BC + ABC, Aβγ = Aβ − AC + ABC,
αβγ = αβ − C + AC + BC − ABC.


It thereupon turns out that $p_{ABC} = p_{11}\,p$, $p_{AB\gamma} = p_{11}\,q$, ..., $p_{\alpha\beta\gamma} = p_{22}\,q$
in terms of four unknowns $p_{11}$, $p_{12}$, $p_{21}$, $p_{22}$ whose sum is 1 and an additional
unknown p. The value of the relative probability of a table is

$$\frac{AB!\,\alpha B!\,A\beta!\,\alpha\beta!\,C!\,\gamma!}{N!\;ABC!\cdots\alpha\beta\gamma!} \tag{4}$$


The problem is to discuss whether the samples of C and of γ come from
the same 2 × 2 universe⁷ of A and B. The expected value of (ABC) is
$N p_{11}\,p$ and is unknown, but if $p_{11}$ and p be estimated from the marginal data
it would be (ABC) = (AB)(C)/N and similarly for the other tabular
elements, from which χ² could then be computed.
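Formula (4) can likewise be checked on the example of footnote 6 (N = 6, C = 4, A = 3, B = 3, AB = 2, so that the marginal face is AB = 2, αB = 1, Aβ = 1, αβ = 2). This is a minimal sketch with hypothetical function names; each table is described by how many individuals of each face cell fall in C.

```python
from fractions import Fraction
from itertools import product
from math import factorial, prod

def tables_face_and_edge(face, C):
    """Yield the C-parts of every table with the given marginal face and edge total C.

    face = (AB, aB, Ab, ab); a yielded tuple gives (ABC, aBC, AbC, abC),
    the remainder of each face cell belonging to gamma.
    """
    for c_parts in product(*(range(f + 1) for f in face)):
        if sum(c_parts) == C:
            yield c_parts

def relative_probability(face, C, c_parts):
    """Formula (4): AB! aB! Ab! ab! C! g! / (N! * product of cell factorials)."""
    N = sum(face)
    g_parts = [f - c for f, c in zip(face, c_parts)]
    num = prod(factorial(f) for f in face) * factorial(C) * factorial(N - C)
    den = factorial(N) * prod(factorial(x) for x in list(c_parts) + g_parts)
    return Fraction(num, den)

face, C = (2, 1, 1, 2), 4
rel = {t: relative_probability(face, C, t) for t in tables_face_and_edge(face, C)}
for t, p in rel.items():
    print(t, p)
```

The eight values obtained are 1/15, 2/15 and 4/15 in the multiplicities listed in footnote 6.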

3. Two Marginal Faces Used.—Of the eight elements in the two 
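marginal faces only six are independent; they may be taken as AB, αB,
Aβ, αβ, AC, αC, leaving two degrees of freedom to be specified by ABC
and BC. As in the first case eight unknown values of p could be expressed
in terms of three independent unknown probabilities and in the second case
in terms of four, so here they can be expressed in terms of five. Indeed, the
part of P in (1) that depends on the probabilities in the universe may be
written


$$p_{ABC}^{ABC}\,p_{AB\gamma}^{AB-ABC}\,p_{A\beta C}^{AC-ABC}\,p_{\alpha BC}^{BC-ABC}\,p_{A\beta\gamma}^{A-AB-AC+ABC}\,p_{\alpha B\gamma}^{B-AB-BC+ABC}\,p_{\alpha\beta C}^{C-AC-BC+ABC}\,p_{\alpha\beta\gamma}^{N-A-B-C+AB+AC+BC-ABC}$$

or

$$\left(\frac{p_{ABC}\,p_{A\beta\gamma}\,p_{\alpha B\gamma}\,p_{\alpha\beta C}}{p_{AB\gamma}\,p_{\alpha BC}\,p_{A\beta C}\,p_{\alpha\beta\gamma}}\right)^{ABC} \left(\frac{p_{\alpha BC}\,p_{\alpha\beta\gamma}}{p_{\alpha B\gamma}\,p_{\alpha\beta C}}\right)^{BC} \times\; p_{AB\gamma}^{AB}\,p_{A\beta C}^{AC}\,p_{A\beta\gamma}^{A-AB-AC}\,p_{\alpha B\gamma}^{B-AB}\,p_{\alpha\beta C}^{C-AC}\,p_{\alpha\beta\gamma}^{N-A-B-C+AB+AC},$$

whence


$$\frac{p_{\alpha BC}\,p_{\alpha\beta\gamma}}{p_{\alpha B\gamma}\,p_{\alpha\beta C}} = \frac{p_{A\beta C}\,p_{AB\gamma}}{p_{ABC}\,p_{A\beta\gamma}} = 1.$$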





These equations are the conditions that B and C are not associated in the
subuniverses of α and of A. From this it follows that


$$p_{ABC} = \frac{p_{AB}\,p_{AC}}{p_A}, \qquad p_{\alpha BC} = \frac{p_{\alpha B}\,p_{\alpha C}}{p_\alpha}, \qquad \ldots, \qquad p_{\alpha\beta\gamma} = \frac{p_{\alpha\beta}\,p_{\alpha\gamma}}{p_\alpha}.$$

The chance of getting the partition AB, αB, Aβ, αβ from N and the
partitions AC, Aγ from A and αC, αγ from α is

$$\frac{N!\;A!\;\alpha!}{AB!\,\alpha B!\,A\beta!\,\alpha\beta!\,AC!\,A\gamma!\,\alpha C!\,\alpha\gamma!}$$

and hence the relative frequencies are


$$\frac{AB!\,\alpha B!\,A\beta!\,\alpha\beta!\,AC!\,A\gamma!\,\alpha C!\,\alpha\gamma!}{A!\;\alpha!\;ABC!\cdots\alpha\beta\gamma!}$$

The problem here is whether the observed 2 × 2 × 2 table could have arisen
from two different non-associated universes,⁸ one in A and the other in α.
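The relative frequencies for two marginal faces factor into one term for the subuniverse of A and one for that of α, which makes them easy to tabulate. The sketch below assumes hypothetical margins (AB = 2, Aβ = 1, AC = 2 and αB = 1, αβ = 2, αC = 1, so N = 6); these numbers and the function names are illustrative only and do not come from the paper.

```python
from fractions import Fraction
from itertools import product
from math import factorial, prod

def split_level(nB, nb, nC):
    """All fillings (BC, Bg, bC, bg) of one level of A.

    nB and nb are the fixed counts with and without B in this level,
    nC is the fixed count with C in this level.
    """
    for BC in range(min(nB, nC) + 1):
        cells = (BC, nB - BC, nC - BC, nb - (nC - BC))
        if min(cells) >= 0:
            yield cells

def level_probability(cells, nB, nb, nC):
    """One level's factor: nB! nb! nC! ng! / (n! * product of cell factorials)."""
    n = nB + nb
    num = prod(factorial(t) for t in (nB, nb, nC, n - nC))
    den = factorial(n) * prod(factorial(c) for c in cells)
    return Fraction(num, den)

def relative_frequencies(level_A, level_a):
    """Relative frequency of every table, keyed by the cells of A followed by those of alpha."""
    out = {}
    for ca, cb in product(list(split_level(*level_A)), list(split_level(*level_a))):
        out[ca + cb] = level_probability(ca, *level_A) * level_probability(cb, *level_a)
    return out

# Hypothetical margins: (AB, A-beta, AC) in A and (alpha-B, alpha-beta, alpha-C) in alpha.
rel = relative_frequencies((2, 1, 2), (1, 2, 1))
for t, p in rel.items():
    print(t, p)
```

Each level's factor is a hypergeometric term, so the product sums to 1 over the series of two degrees of freedom.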

4. The Three Faces Are Used.—Here there is only one degree of freedom 
which may be specified by ABC; there are seven independent unknown 
frequencies. The condition for independence of P of the unknown p's
is readily shown to be


$$\frac{p_{ABC}\,p_{A\beta\gamma}\,p_{\alpha B\gamma}\,p_{\alpha\beta C}}{p_{\alpha\beta\gamma}\,p_{AB\gamma}\,p_{\alpha BC}\,p_{A\beta C}} = 1 \quad\text{or}\quad \frac{p_{ABC}\,p_{A\beta\gamma}}{p_{AB\gamma}\,p_{A\beta C}} = \frac{p_{\alpha BC}\,p_{\alpha\beta\gamma}}{p_{\alpha\beta C}\,p_{\alpha B\gamma}}.$$


The association coefficients of Yule, namely,


$$Q = \frac{AB\cdot\alpha\beta - \alpha B\cdot A\beta}{AB\cdot\alpha\beta + \alpha B\cdot A\beta}, \qquad \omega = \frac{\sqrt{AB}\sqrt{\alpha\beta} - \sqrt{\alpha B}\sqrt{A\beta}}{\sqrt{AB}\sqrt{\alpha\beta} + \sqrt{\alpha B}\sqrt{A\beta}},$$


are both functions of the quotient AB·αβ/αB·Aβ. The condition just
found for the p's shows that B and C have the same association coefficients
in the subuniverses of A and α (and because of symmetry A and C have
the same association coefficients in the subuniverses of B and β, and so for
A and B in C and γ). The problem is to determine whether the observed
table could reasonably have arisen from such a 2 × 2 × 2 universe.⁹ We
have not yet been able to find a formula for the relative probabilities of all
the tables in the linear series but when numbers are small the reciprocal
of the product of the factorials of the cell frequencies of all tables of the
series may be tabulated and the relative frequencies found.¹⁰

Although the value of the probabilities in the universe must remain unknown
as in previous cases because the very essence of this approach is
that the results must be independent of those unknown probabilities, one
may assign values to them in terms of the marginal elements that are used,
and these marginal elements would define the probabilities if the sampling
were really from a universe in which the marginal elements used were
really fixed by external constraints instead of appearing by virtue of the
sampling process. For the present case the expected value $Np_{ABC}$ of
ABC would then be ABC − x, with x the value which satisfies the cubic equation


$$(ABC - x)(A\beta\gamma - x)(\alpha B\gamma - x)(\alpha\beta C - x) = (\alpha\beta\gamma + x)(AB\gamma + x)(\alpha BC + x)(A\beta C + x)$$


or the value of $p_{ABC}$ would come from the cubic equation


$$p_{ABC}^3 - p_{ABC}^2\,[\,p_{AB} + p_{AC} + p_{BC} - p_A p_B - p_A p_C - p_B p_C + p_A p_{BC} + p_B p_{AC} + p_C p_{AB}\,]$$
$$+\; p_{ABC}\,[\,2p_{AB}p_{AC}p_{BC} + p_{AB}p_{AC} + p_{AB}p_{BC} + p_{AC}p_{BC} + p_A p_{BC}^2 + p_B p_{AC}^2 + p_C p_{AB}^2 + p_A p_B p_C$$
$$-\; p_A p_C p_{AB} - p_A p_C p_{BC} - p_B p_C p_{AB} - p_B p_C p_{AC} - p_A p_B p_{AC} - p_A p_B p_{BC}\,]$$
$$-\; p_{AB}\,p_{AC}\,p_{BC}\,[\,1 - p_A - p_B - p_C + p_{AB} + p_{AC} + p_{BC}\,] = 0.$$

The value of x must lie between the least of the four frequencies ABC,
Aβγ, αBγ, αβC and the negative of the least of the four frequencies αβγ,
ABγ, AβC, αBC. It may be observed that the expected value is here not
the mean value¹¹ in the series as it is in the previous three cases and in the
m × n contingency table.
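The linear series and the expected value can be tabulated for the example of footnotes 10 and 11 (N = 45, A = 19, B = 14, C = 20, AB = 5, AC = 8, BC = 7). The sketch below is an illustration only: the helper names are hypothetical, the expected value is located by simple bisection on the equality of the two cross-products rather than by solving the cubic in closed form, and a, b, g stand for α, β, γ.

```python
from fractions import Fraction
from math import factorial, prod

def cells(t):
    """Cells (ABC, ABg, AbC, aBC, Abg, aBg, abC, abg) when ABC = t,
    for N = 45, A = 19, B = 14, C = 20, AB = 5, AC = 8, BC = 7."""
    return (t, 5 - t, 8 - t, 7 - t, 6 + t, 2 + t, 5 + t, 12 - t)

# The series has one degree of freedom: ABC runs from 0 to 5.
series = [cells(t) for t in range(6)]
weights = [Fraction(1, prod(factorial(c) for c in tab)) for tab in series]
total = sum(weights)
rel = [w / total for w in weights]
mean_ABC = sum(tab[0] * r for tab, r in zip(series, rel))

def imbalance(t):
    """ABC*Abg*aBg*abC - abg*ABg*AbC*aBC; zero at the expected value of ABC."""
    ABC, ABg, AbC, aBC, Abg, aBg, abC, abg = cells(t)
    return ABC * Abg * aBg * abC - abg * ABg * AbC * aBC

# Bisection: imbalance is negative at t = 0 and positive at t = 5.
lo, hi = 0.0, 5.0
for _ in range(60):
    mid = (lo + hi) / 2
    if imbalance(mid) < 0:
        lo = mid
    else:
        hi = mid
expected_ABC = (lo + hi) / 2

print([str(r) for r in rel])
print(float(mean_ABC), expected_ABC)
```

Bisection is used because the imbalance is monotone in ABC over the admissible range, so the root is unique there.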


1 See, these PROCEEDINGS, 28, 378-384 (1942). 


2 For example, if the table be ABC = 0, ABγ = 0, AβC = 0, αBC = 0, Aβγ = 1,
αBγ = 2, αβC = 3, αβγ = 1 there are 10 tables consistent with N = 7, A = 1, B = 2,
C = 3, viz.,

          ABC  ABγ  AβC  αBC  Aβγ  αBγ  αβC  αβγ
     1     0    0    0    0    1    2    3    1
     2     0    0    0    1    1    1    2    2
     3     0    0    0    2    1    0    1    3
     4     0    0    1    0    0    2    2    2
     5     0    0    1    1    0    1    1    3
     6     0    0    1    2    0    0    0    4
     7     0    1    0    0    0    1    3    2
     8     0    1    0    1    0    0    2    3
     9     1    0    0    0    0    1    2    3
    10     1    0    0    1    0    0    1    4


3 The relative probabilities of the above tables are

      1      2      3      4      5      6      7      8      9     10
    4/49  12/49   4/49   6/49   8/49   1/49   4/49   4/49   4/49   2/49

Only set-up 6 is significant at the 0.05 level with a relative probability of 1/49.


4 If it were desired to use the marginal values to estimate $p_1 = p_A$ as (A)/N, etc.,
the expected value of (ABC) would be (A)(B)(C)/N², etc., and then χ² could be computed.


5 If there were k characters, the number of degrees of freedom for complete independence
would be $2^k - k - 1$ and the relative probability would be

$$\frac{A!\,\alpha!\,B!\,\beta!\cdots K!\,\kappa!}{(N!)^{k-1}\,(AB\cdots K)!\cdots(\alpha\beta\cdots\kappa)!}$$

Even with small numbers there would be serious difficulty in carrying out the arithmetic
for a particular case.

6 For example, let N = 6, C = 4, A = 3, B = 3, AB = 2. There are consistent with
these boundary conditions 8 tables which are as follows:

          ABC  ABγ  AβC  αBC  Aβγ  αBγ  αβC  αβγ
     1     1    1    1    0    0    1    2    0
     2     0    2    1    1    0    0    2    0
     3     2    0    1    0    0    1    1    1
     4     2    0    0    0    1    1    2    0
     5     1    1    1    1    0    0    1    1
     6     1    1    0    1    1    0    2    0
     7     2    0    1    1    0    0    0    2
     8     2    0    0    1    1    0    1    1

The relative probabilities of these tables are

      1      2      3      4      5      6      7      8
    2/15   1/15   2/15   1/15   4/15   2/15   1/15   2/15
7 The natural generalization would be to consider whether k tables containing $C_1$,
$C_2$, ..., $C_k$ elements could arise from the same 2 × 2 universe. The relative probabilities
of the different 2 × 2 × k tables would be

$$\frac{AB!\,\alpha B!\,A\beta!\,\alpha\beta!\,C_1!\,C_2!\cdots C_k!}{N! \times \text{product of factorials of cell frequencies}}$$





8 The natural generalization would be to consider whether a 2 × 2 × k table could
have originated from k non-associated universes by taking samples of $A_1$, $A_2$, ..., $A_k$
from them and the relative probabilities would be

$$\frac{A_1B!\,A_1\beta!\,A_1C!\,A_1\gamma!\cdots A_kB!\,A_k\beta!\,A_kC!\,A_k\gamma!}{A_1!\,A_2!\cdots A_k! \times \text{product of factorials of cell frequencies}}$$


9 The natural generalization would be to k characters and a $2^k$ table. The one condition
on the $2^k$ values of the cell probabilities in the universe would be that the product
of the $2^{k-1}$ values of p for cells corresponding to k, k − 2, k − 4, ... positive characters
would have to be equal to the product of the $2^{k-1}$ values of p for cells corresponding to
k − 1, k − 3, k − 5, ... positive characters. It is to be understood that all combinations
of k − 1 characters are used.

10 If N = 45, A = 19, B = 14, C = 20, AB = 5, AC = 8, BC = 7 there are 6 tables
consistent with these boundary conditions which are as follows:

          ABC  ABγ  AβC  αBC  Aβγ  αBγ  αβC  αβγ
     1     5    0    3    2   11    7   10    7
     2     4    1    4    3   10    6    9    8
     3     3    2    5    4    9    5    8    9
     4     2    3    6    5    8    4    7   10
     5     1    4    7    6    7    3    6   11
     6     0    5    8    7    6    2    5   12

The relative frequencies of these tables as determined from the reciprocal of the product
of the factorials of the cell frequencies are

         1            2             3             4            5            6
      48/30869   1925/30869   11550/30869   13860/30869   3360/30869   126/30869

Numbers 1 and 6 are significant at the 0.05 level.
11 Note that the mean value of ABC in the above series is 2.3865 whereas the expected
value as determined from the cubic equation is 2.3884.











